Derivative-informed neural operator acceleration of geometric MCMC for infinite-dimensional Bayesian inverse problems (2403.08220v2)
Abstract: We propose an operator learning approach to accelerate geometric Markov chain Monte Carlo (MCMC) for solving infinite-dimensional Bayesian inverse problems (BIPs). While geometric MCMC employs high-quality proposals that adapt to posterior local geometry, it requires repeated computations of gradients and Hessians of the log-likelihood, which becomes prohibitive when the parameter-to-observable (PtO) map is defined through expensive-to-solve parametric partial differential equations (PDEs). We consider a delayed-acceptance geometric MCMC method driven by a neural operator surrogate of the PtO map, where the proposal exploits fast surrogate predictions of the log-likelihood and, simultaneously, its gradient and Hessian. To achieve a substantial speedup, the surrogate must accurately approximate the PtO map and its Jacobian, which often demands a prohibitively large number of PtO map samples via conventional operator learning methods. In this work, we present an extension of derivative-informed operator learning [O'Leary-Roseberry et al., J. Comput. Phys., 496 (2024)] that uses joint samples of the PtO map and its Jacobian. This leads to derivative-informed neural operator (DINO) surrogates that accurately predict the observables and posterior local geometry at a significantly lower training cost than conventional methods. Cost and error analyses for reduced basis DINO surrogates are provided. Numerical studies demonstrate that DINO-driven MCMC generates effective posterior samples 3–9 times faster than geometric MCMC and 60–97 times faster than prior geometry-based MCMC. Furthermore, the training cost of DINO surrogates breaks even compared to geometric MCMC after just 10–25 effective posterior samples.
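The delayed-acceptance idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the geometric (gradient- and Hessian-informed) proposal is swapped for a plain preconditioned Crank–Nicolson (pCN) proposal, the prior is assumed to be a standard Gaussian N(0, I) in two dimensions, and the "expensive" model and its surrogate are hypothetical linear maps `G` and `G_surr` chosen only for illustration. The function name `delayed_acceptance_pcn` and all parameters are likewise assumptions. The sketch shows the two-stage structure: a proposal is first screened with the cheap surrogate, and the expensive model is evaluated only for proposals that pass.

```python
import numpy as np


def delayed_acceptance_pcn(phi_true, phi_surr, x0, n_steps, beta=0.5, seed=0):
    """Two-stage (delayed-acceptance) MCMC with a pCN proposal.

    Assumes a N(0, I) prior. phi_true and phi_surr return the negative
    log-likelihood of the expensive model and of its surrogate.
    Returns the chain and the number of expensive-model evaluations.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx_true, fx_surr = phi_true(x), phi_surr(x)
    n_true_evals = 1
    chain = np.empty((n_steps, x.size))
    for k in range(n_steps):
        # pCN proposal, reversible with respect to the N(0, I) prior.
        y = np.sqrt(1.0 - beta**2) * x + beta * rng.standard_normal(x.size)
        fy_surr = phi_surr(y)
        # Stage 1: accept/reject using the surrogate likelihood only.
        if np.log(rng.uniform()) < fx_surr - fy_surr:
            fy_true = phi_true(y)
            n_true_evals += 1
            # Stage 2: correct for the surrogate error so that the chain
            # still targets the posterior defined by phi_true.
            log_alpha2 = (fx_true - fy_true) - (fx_surr - fy_surr)
            if np.log(rng.uniform()) < log_alpha2:
                x, fx_true, fx_surr = y, fy_true, fy_surr
        chain[k] = x
    return chain, n_true_evals


# Hypothetical toy problem: linear forward map with Gaussian noise.
G = np.array([[1.0, 0.5], [0.0, 1.0]])   # stand-in for the expensive PtO map
G_surr = G + 0.05                        # stand-in for an imperfect surrogate
d = np.array([1.0, -0.5])                # synthetic data
sigma = 0.5                              # noise standard deviation


def phi_true(x):
    return 0.5 * np.sum((G @ x - d) ** 2) / sigma**2


def phi_surr(x):
    return 0.5 * np.sum((G_surr @ x - d) ** 2) / sigma**2


chain, n_true = delayed_acceptance_pcn(phi_true, phi_surr, np.zeros(2), 20000)
print("posterior mean estimate:", chain[2000:].mean(axis=0))
print("expensive evaluations:", n_true, "of", len(chain), "steps")
```

In this toy setting the exact posterior is Gaussian with mean (24/26, -8/26), which gives a simple check on the sampler. The count of expensive evaluations is at most one per step and is typically smaller, since proposals rejected in the first stage never reach the expensive model.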
- Predictive computational science: Computer predictions in the presence of uncertainty, in: Encyclopedia of Computational Mechanics, John Wiley & Sons, Ltd, 2nd edition, 2017, pp. 1–26.
- D. P. Kouri, A. Shapiro, Optimization of PDEs with uncertain inputs, Frontiers in PDE-constrained optimization (2018) 41–81.
- O. Ghattas, K. Willcox, Learning physics-based models from data: Perspectives from inverse problems and model reduction, Acta Numerica 30 (2021) 445–554.
- A. Alexanderian, Optimal experimental design for infinite-dimensional Bayesian inverse problems governed by PDEs: a review, Inverse Problems 37 (2021) 043001.
- Equation of state calculations by fast computing machines, The Journal of Chemical Physics 21 (1953) 1087–1092.
- W. K. Hastings, Monte Carlo sampling methods using Markov chains and their applications, Biometrika 57 (1970) 97–109.
- G. O. Roberts, J. S. Rosenthal, General state space Markov chains and MCMC algorithms, Probability Surveys 1 (2004) 20–71.
- M. Girolami, B. Calderhead, Riemann manifold Langevin and Hamiltonian Monte Carlo methods, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 (2011) 123–214.
- A stochastic Newton MCMC method for large-scale statistical inverse problems with application to seismic inversion, SIAM Journal on Scientific Computing 34 (2012) A1460–A1487.
- K. Law, Proposals which speed up function-space MCMC, Journal of Computational and Applied Mathematics 262 (2014) 127–138. Selected Papers from NUMDIFF-13.
- T. Bui-Thanh, M. Girolami, Solving large-scale PDE-constrained Bayesian inverse problems with Riemann manifold Hamiltonian Monte Carlo, Inverse Problems 30 (2014) 114014.
- Emulation of higher-order tensors in manifold Monte Carlo methods for Bayesian inverse problems, Journal of Computational Physics 308 (2016) 81–101.
- Geometric MCMC for infinite-dimensional inverse problems, Journal of Computational Physics 335 (2017) 327–351.
- S. Lan, Adaptive dimension reduction to accelerate infinite-dimensional geometric Markov chain Monte Carlo, Journal of Computational Physics 392 (2019) 71–95.
- J. A. Christen, C. Fox, Markov chain Monte Carlo using an approximation, Journal of Computational and Graphical Statistics 14 (2005) 795–810.
- Preconditioning Markov chain Monte Carlo simulations using coarse-scale models, SIAM Journal on Scientific Computing 28 (2006) 776–803.
- Multilevel delayed acceptance MCMC, SIAM/ASA Journal on Uncertainty Quantification 11 (2023) 1–30.
- Neural operator: Learning maps between function spaces with applications to PDEs, Journal of Machine Learning Research 24 (2023) 1–97.
- Operator learning: Algorithms and analysis, arXiv preprint arXiv:2402.15715 (2024).
- Efficient PDE-constrained optimization under high-dimensional uncertainty using derivative-informed neural operators, arXiv preprint arXiv:2305.20053 (2023).
- Derivative-Informed Neural Operator: An efficient framework for high-dimensional parametric derivative learning, Journal of Computational Physics 496 (2024) 112555.
- MCMC methods for functions: Modifying old algorithms to make them faster, Statistical Science 28 (2013) 424–446.
- J. Hesthaven, S. Ubbiali, Non-intrusive reduced order modeling of nonlinear problems using neural networks, Journal of Computational Physics 363 (2018) 55–78.
- S. Fresca, A. Manzoni, POD-DL-ROM: Enhancing deep learning-based reduced order models for nonlinear parametrized PDEs by proper orthogonal decomposition, Computer Methods in Applied Mechanics and Engineering 388 (2022) 114181.
- Model reduction and neural network for parametric PDEs, The SMAI Journal of computational mathematics 7 (2021) 121–157.
- Learning high-dimensional parametric maps via reduced basis adaptive residual networks, Computer Methods in Applied Mechanics and Engineering 402 (2022a) 115730.
- Derivative-informed projected neural networks for high-dimensional parametric maps governed by PDEs, Computer Methods in Applied Mechanics and Engineering 388 (2022b) 114199.
- DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators, Nature Machine Intelligence (2021).
- Variational autoencoding neural operators, arXiv preprint arXiv:2302.10351 (2023).
- Fourier neural operator for parametric partial differential equations, arXiv preprint arXiv:2010.08895 (2021).
- LNO: Laplace neural operator for solving differential equations, arXiv preprint arXiv:2303.10528 (2023).
- The nonlocal neural operator: Universal approximation, arXiv preprint arXiv:2304.13221 (2023).
- Neural operator: Graph kernel network for partial differential equations, arXiv preprint arXiv:2003.03485 (2020a).
- Multipole graph neural operator for parametric partial differential equations, in: H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, H. Lin (Eds.), Advances in Neural Information Processing Systems, volume 33, Curran Associates, Inc., 2020b, pp. 6755–6766.
- The cost-accuracy trade-off in operator learning with neural networks, arXiv preprint arXiv:2203.13181 (2022).
- A comprehensive and fair comparison of two neural operators (with practical extensions) based on fair data, Computer Methods in Applied Mechanics and Engineering 393 (2022) 114778.
- Physics-informed neural operator for learning partial differential equations, ACM / IMS Journal of Data Science (2024).
- Learning the solution operator of parametric partial differential equations with physics-informed DeepONets, Science Advances 7 (2021) eabi8605.
- J. Go, P. Chen, Accelerating Bayesian optimal experimental design with derivative-informed neural operators, arXiv preprint arXiv:2312.14810 (2023).
- Spectral gaps for a Metropolis–Hastings algorithm in infinite dimensions, The Annals of Applied Probability 24 (2014) 2455–2490.
- Dimension-independent likelihood-informed MCMC, Journal of Computational Physics 304 (2016) 109–137.
- T. Bui-Thanh, Q. P. Nguyen, FEM-based discretization-invariant MCMC methods for PDE-constrained Bayesian inverse problems, Inverse Problems and Imaging 10 (2016) 943–975.
- D. Rudolf, B. Sprungk, On a generalization of the preconditioned Crank–Nicolson Metropolis algorithm, Foundations of Computational Mathematics 18 (2018) 309–343.
- A computational framework for infinite-dimensional Bayesian inverse problems, Part II: Stochastic Newton MCMC with application to ice sheet flow inverse problems, SIAM Journal on Scientific Computing 36 (2014) A1525–A1555.
- Algorithms for Kullback–Leibler approximation of probability measures in infinite dimensions, SIAM Journal on Scientific Computing 37 (2015) A2733–A2757.
- Active subspace methods in theory and practice: Applications to kriging surfaces, SIAM Journal on Scientific Computing 36 (2014) A1500–A1524.
- Accelerating Markov chain Monte Carlo with active subspaces, SIAM Journal on Scientific Computing 38 (2016) A2779–A2805.
- hIPPYlib-MUQ: A Bayesian inference software framework for integration of data with complex predictive models under uncertainty, ACM Transactions on Mathematical Software 49 (2023).
- Accelerating asymptotically exact MCMC for computationally intensive models via local approximations, Journal of the American Statistical Association 111 (2016) 1591–1607.
- B. Peherstorfer, Y. Marzouk, A transport-based multifidelity preconditioner for Markov chain Monte Carlo, Advances in Computational Mathematics 45 (2019) 2321–2348.
- Complexity analysis of accelerated MCMC methods for Bayesian inversion, Inverse Problems 29 (2013) 085010.
- Multilevel sequential$^2$ Monte Carlo for Bayesian inverse problems, Journal of Computational Physics 368 (2018) 154–178.
- Multilevel Markov chain Monte Carlo, SIAM Review 61 (2019) 509–545.
- Multilevel dimension-independent likelihood-informed MCMC for large-scale inverse problems, Inverse Problems 40 (2024) 035005.
- A. M. Stuart, Inverse problems: A Bayesian perspective, Acta Numerica 19 (2010) 451–559.
- L. Tierney, A note on Metropolis–Hastings kernels for general state spaces, The Annals of Applied Probability 8 (1998) 1–9.
- Gradient-based data and parameter dimension reduction for Bayesian models: An information theoretic perspective, arXiv preprint arXiv:2207.08670 (2022).
- Gradient-based dimension reduction of multivariate vector-valued functions, SIAM Journal on Scientific Computing 42 (2020) A534–A558.
- T. Cui, O. Zahm, Data-free likelihood-informed dimension reduction of Bayesian inverse problems, Inverse Problems 37 (2021) 045009.
- A computational framework for infinite-dimensional Bayesian inverse problems part I: The linearized case, with application to global seismic inversion, SIAM Journal on Scientific Computing 35 (2013) A2494–A2523.
- hIPPYlib: An extensible software framework for large-scale inverse problems governed by PDEs: Part I: Deterministic inversion and linearized Bayesian inference, ACM Transactions on Mathematical Software 47 (2021).
- A survey of direct methods for sparse linear systems, Acta Numerica 25 (2016) 383–566.
- Residual-based error correction for neural operator accelerated infinite-dimensional Bayesian inverse problems, Journal of Computational Physics 486 (2023) 112104.
- C. Schwab, J. Zech, Deep learning in high dimension: Neural network expression rates for analytic functions in $L^2(\mathbb{R}^d, \gamma_d)$, SIAM/ASA Journal on Uncertainty Quantification 11 (2023) 199–234.
- Data-driven model reduction for the Bayesian solution of inverse problems, International Journal for Numerical Methods in Engineering 102 (2015) 966–990.
- P. Whittle, On stationary processes in the plane, Biometrika 41 (1954) 434–449.
- MAP estimators and their consistency in Bayesian nonparametric inverse problems, Inverse Problems 29 (2013) 095017.
- S. P. Brooks, A. Gelman, General methods for monitoring convergence of iterative simulations, Journal of Computational and Graphical Statistics 7 (1998) 434–455.
- D. Dowson, B. Landau, The Fréchet distance between multivariate normal distributions, Journal of Multivariate Analysis 12 (1982) 450–455.
- The FEniCS project version 1.5, Archive of Numerical Software 3 (2015).
- hIPPYlib: An extensible software framework for large-scale inverse problems, The Journal of Open Source Software 3 (2018) 940.
- Multifidelity dimension reduction via active subspaces, SIAM Journal on Scientific Computing 42 (2020) A929–A956.