
Derivative-informed neural operator acceleration of geometric MCMC for infinite-dimensional Bayesian inverse problems (2403.08220v2)

Published 13 Mar 2024 in math.NA, cs.LG, cs.NA, stat.CO, and stat.ML

Abstract: We propose an operator learning approach to accelerate geometric Markov chain Monte Carlo (MCMC) for solving infinite-dimensional Bayesian inverse problems (BIPs). While geometric MCMC employs high-quality proposals that adapt to posterior local geometry, it requires repeated computations of gradients and Hessians of the log-likelihood, which becomes prohibitive when the parameter-to-observable (PtO) map is defined through expensive-to-solve parametric partial differential equations (PDEs). We consider a delayed-acceptance geometric MCMC method driven by a neural operator surrogate of the PtO map, where the proposal exploits fast surrogate predictions of the log-likelihood and, simultaneously, its gradient and Hessian. To achieve a substantial speedup, the surrogate must accurately approximate the PtO map and its Jacobian, which often demands a prohibitively large number of PtO map samples via conventional operator learning methods. In this work, we present an extension of derivative-informed operator learning [O'Leary-Roseberry et al., J. Comput. Phys., 496 (2024)] that uses joint samples of the PtO map and its Jacobian. This leads to derivative-informed neural operator (DINO) surrogates that accurately predict the observables and posterior local geometry at a significantly lower training cost than conventional methods. Cost and error analyses for reduced basis DINO surrogates are provided. Numerical studies demonstrate that DINO-driven MCMC generates effective posterior samples 3–9 times faster than geometric MCMC and 60–97 times faster than prior geometry-based MCMC. Furthermore, the training cost of DINO surrogates breaks even compared to geometric MCMC after just 10–25 effective posterior samples.
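
The delayed-acceptance idea in the abstract can be illustrated with a minimal sketch. This is not the paper's method: it swaps the geometric, surrogate-informed proposal for a plain preconditioned Crank-Nicolson (pCN) proposal under a standard Gaussian prior, and both `forward_true` and `forward_surrogate` are hypothetical two-parameter toy maps standing in for a PDE solve and a trained neural operator. It only shows how a cheap surrogate screens proposals before the expensive map is evaluated, with a second stage that corrects for surrogate error.

```python
import numpy as np


def forward_true(m):
    # Hypothetical "expensive" PtO map (toy stand-in for a PDE solve).
    return np.array([m[0] + 0.5 * m[1] ** 2, np.sin(m[1])])


def forward_surrogate(m):
    # Hypothetical cheap surrogate: a slightly inaccurate copy of the true map.
    return np.array([m[0] + 0.45 * m[1] ** 2, m[1] - m[1] ** 3 / 6.0])


def misfit(forward, m, y_obs, noise_std):
    # Negative log-likelihood (up to a constant) for i.i.d. Gaussian noise.
    r = (forward(m) - y_obs) / noise_std
    return 0.5 * float(r @ r)


def delayed_acceptance_pcn(y_obs, noise_std, n_steps, beta=0.3, seed=0):
    """Two-stage (delayed-acceptance) pCN sampler with an N(0, I) prior."""
    rng = np.random.default_rng(seed)
    m = np.zeros(2)
    phi_true = misfit(forward_true, m, y_obs, noise_std)
    phi_surr = misfit(forward_surrogate, m, y_obs, noise_std)
    samples = np.empty((n_steps, 2))
    n_true_evals, n_accepted = 1, 0

    for k in range(n_steps):
        # pCN proposal: reversible with respect to the N(0, I) prior,
        # so only likelihood ratios enter the acceptance probabilities.
        prop = np.sqrt(1.0 - beta**2) * m + beta * rng.standard_normal(2)

        # Stage 1: screen the proposal using the surrogate only.
        phi_surr_prop = misfit(forward_surrogate, prop, y_obs, noise_std)
        if np.log(rng.uniform()) < phi_surr - phi_surr_prop:
            # Stage 2: evaluate the true map and correct for surrogate error.
            phi_true_prop = misfit(forward_true, prop, y_obs, noise_std)
            n_true_evals += 1
            log_alpha2 = (phi_true - phi_true_prop) - (phi_surr - phi_surr_prop)
            if np.log(rng.uniform()) < log_alpha2:
                m, phi_true, phi_surr = prop, phi_true_prop, phi_surr_prop
                n_accepted += 1
        samples[k] = m

    return samples, n_true_evals, n_accepted


if __name__ == "__main__":
    y = forward_true(np.array([0.5, -0.3]))  # synthetic, noise-free data
    chain, n_evals, n_acc = delayed_acceptance_pcn(y, noise_std=0.2, n_steps=5000)
    print("true-map evaluations:", n_evals, "accepted moves:", n_acc)
```

The true map is evaluated only for proposals that pass the surrogate screening, so the number of expensive evaluations never exceeds the number of steps plus one. The second-stage ratio keeps the chain targeting the posterior defined by the true map, however inaccurate the surrogate is; a poor surrogate costs efficiency, not correctness.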

References (72)
  1. Predictive computational science: Computer predictions in the presence of uncertainty, in: Encyclopedia of Computational Mechanics, John Wiley & Sons, Ltd, 2nd edition, 2017, pp. 1–26.
  2. D. P. Kouri, A. Shapiro, Optimization of pdes with uncertain inputs, Frontiers in PDE-constrained optimization (2018) 41–81.
  3. O. Ghattas, K. Willcox, Learning physics-based models from data: Perspectives from inverse problems and model reduction, Acta Numerica 30 (2021) 445–554.
  4. A. Alexanderian, Optimal experimental design for infinite-dimensional bayesian inverse problems governed by pdes: a review, Inverse Problems 37 (2021) 043001.
  5. Equation of state calculations by fast computing machines, The Journal of Chemical Physics 21 (1953) 1087–1092.
  6. W. K. Hastings, Monte Carlo sampling methods using Markov chains and their applications, Biometrika 57 (1970) 97–109.
  7. G. O. Roberts, J. S. Rosenthal, General state space Markov chains and MCMC algorithms, Probability Surveys 1 (2004) 20–71.
  8. M. Girolami, B. Calderhead, Riemann manifold Langevin and Hamiltonian Monte Carlo methods, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 (2011) 123–214.
  9. A stochastic Newton MCMC method for large-scale statistical inverse problems with application to seismic inversion, SIAM Journal on Scientific Computing 34 (2012) A1460–A1487.
  10. K. Law, Proposals which speed up function-space MCMC, Journal of Computational and Applied Mathematics 262 (2014) 127–138. Selected Papers from NUMDIFF-13.
  11. T. Bui-Thanh, M. Girolami, Solving large-scale PDE-constrained Bayesian inverse problems with Riemann manifold Hamiltonian Monte Carlo, Inverse Problems 30 (2014) 114014.
  12. Emulation of higher-order tensors in manifold Monte Carlo methods for Bayesian inverse problems, Journal of Computational Physics 308 (2016) 81–101.
  13. Geometric MCMC for infinite-dimensional inverse problems, Journal of Computational Physics 335 (2017) 327–351.
  14. S. Lan, Adaptive dimension reduction to accelerate infinite-dimensional geometric Markov chain Monte Carlo, Journal of Computational Physics 392 (2019) 71–95.
  15. J. A. Christen, C. Fox, Markov chain Monte Carlo using an approximation, Journal of Computational and Graphical Statistics 14 (2005) 795–810.
  16. Preconditioning Markov chain Monte Carlo simulations using coarse-scale models, SIAM Journal on Scientific Computing 28 (2006) 776–803.
  17. Multilevel delayed acceptance MCMC, SIAM/ASA Journal on Uncertainty Quantification 11 (2023) 1–30.
  18. Neural operator: Learning maps between function spaces with applications to PDEs, Journal of Machine Learning Research 24 (2023) 1–97.
  19. Operator learning: Algorithms and analysis, arXiv preprint, arXiv.2402.15715 (2024).
  20. Efficient PDE-constrained optimization under high-dimensional uncertainty using derivative-informed neural operators, arXiv preprint arXiv.2305.20053 (2023).
  21. Derivative-Informed Neural Operator: An efficient framework for high-dimensional parametric derivative learning, Journal of Computational Physics 496 (2024) 112555.
  22. MCMC methods for functions: Modifying old algorithms to make them faster, Statistical Science 28 (2013) 424–446.
  23. J. Hesthaven, S. Ubbiali, Non-intrusive reduced order modeling of nonlinear problems using neural networks, Journal of Computational Physics 363 (2018) 55–78.
  24. S. Fresca, A. Manzoni, POD-DL-ROM: Enhancing deep learning-based reduced order models for nonlinear parametrized PDEs by proper orthogonal decomposition, Computer Methods in Applied Mechanics and Engineering 388 (2022) 114181.
  25. Model reduction and neural network for parametric PDEs, The SMAI Journal of computational mathematics 7 (2021) 121–157.
  26. Learning high-dimensional parametric maps via reduced basis adaptive residual networks, Computer Methods in Applied Mechanics and Engineering 402 (2022a) 115730.
  27. Derivative-informed projected neural networks for high-dimensional parametric maps governed by PDEs, Computer Methods in Applied Mechanics and Engineering 388 (2022b) 114199.
  28. DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators, Nature Machine Intelligence (2021).
  29. Variational autoencoding neural operators, arXiv preprint, arXiv.2302.10351 (2023).
  30. Fourier neural operator for parametric partial differential equations, arXiv preprint, arXiv.2010.08895 (2021).
  31. LNO: Laplace neural operator for solving differential equations, arXiv preprint, arXiv.2303.10528 (2023).
  32. The nonlocal neural operator: Universal approximation, arXiv preprint, arXiv.2304.13221 (2023).
  33. Neural operator: Graph kernel network for partial differential equations, arXiv preprint, arXiv.2003.03485 (2020a).
  34. Multipole graph neural operator for parametric partial differential equations, in: H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, H. Lin (Eds.), Advances in Neural Information Processing Systems, volume 33, Curran Associates, Inc., 2020b, pp. 6755–6766.
  35. The cost-accuracy trade-off in operator learning with neural networks, arXiv preprint, arXiv:2203.13181 (2022).
  36. A comprehensive and fair comparison of two neural operators (with practical extensions) based on fair data, Computer Methods in Applied Mechanics and Engineering 393 (2022) 114778.
  37. Physics-informed neural operator for learning partial differential equations, ACM / IMS Journal of Data Science (2024).
  38. Learning the solution operator of parametric partial differential equations with physics-informed DeepONets, Science Advances 7 (2021) eabi8605.
  39. J. Go, P. Chen, Accelerating Bayesian optimal experimental design with derivative-informed neural operators, arXiv preprint arXiv:2312.14810 (2023).
  40. Spectral gaps for a Metropolis–Hastings algorithm in infinite dimensions, The Annals of Applied Probability 24 (2014) 2455–2490.
  41. Dimension-independent likelihood-informed MCMC, Journal of Computational Physics 304 (2016) 109–137.
  42. T. Bui-Thanh, Q. P. Nguyen, FEM-based discretization-invariant MCMC methods for PDE-constrained Bayesian inverse problems, Inverse Problems and Imaging 10 (2016) 943–975.
  43. D. Rudolf, B. Sprungk, On a generalization of the preconditioned Crank–Nicolson Metropolis algorithm, Foundations of Computational Mathematics 18 (2018) 309–343.
  44. A computational framework for infinite-dimensional Bayesian inverse problems, Part II: Stochastic Newton MCMC with application to ice sheet flow inverse problems, SIAM Journal on Scientific Computing 36 (2014) A1525–A1555.
  45. Algorithms for Kullback–Leibler approximation of probability measures in infinite dimensions, SIAM Journal on Scientific Computing 37 (2015) A2733–A2757.
  46. Active subspace methods in theory and practice: Applications to kriging surfaces, SIAM Journal on Scientific Computing 36 (2014) A1500–A1524.
  47. Accelerating Markov chain Monte Carlo with active subspaces, SIAM Journal on Scientific Computing 38 (2016) A2779–A2805.
  48. hIPPYlib-MUQ: A Bayesian inference software framework for integration of data with complex predictive models under uncertainty, ACM Trans. Math. Softw. 49 (2023).
  49. Accelerating asymptotically exact MCMC for computationally intensive models via local approximations, Journal of the American Statistical Association 111 (2016) 1591–1607.
  50. B. Peherstorfer, Y. Marzouk, A transport-based multifidelity preconditioner for Markov chain Monte Carlo, Advances in Computational Mathematics 45 (2019) 2321–2348.
  51. Complexity analysis of accelerated MCMC methods for Bayesian inversion, Inverse Problems 29 (2013) 085010.
  52. Multilevel sequential$^2$ Monte Carlo for Bayesian inverse problems, Journal of Computational Physics 368 (2018) 154–178.
  53. Multilevel Markov chain Monte Carlo, SIAM Review 61 (2019) 509–545.
  54. Multilevel dimension-independent likelihood-informed MCMC for large-scale inverse problems, Inverse Problems 40 (2024) 035005.
  55. A. M. Stuart, Inverse problems: A Bayesian perspective, Acta Numerica 19 (2010) 451–559.
  56. L. Tierney, A note on Metropolis–Hastings kernels for general state spaces, The Annals of Applied Probability 8 (1998) 1–9.
  57. Gradient-based data and parameter dimension reduction for Bayesian models: An information theoretic perspective, arXiv preprint, arXiv.2207.08670 (2022).
  58. Gradient-based dimension reduction of multivariate vector-valued functions, SIAM Journal on Scientific Computing 42 (2020) A534–A558.
  59. T. Cui, O. Zahm, Data-free likelihood-informed dimension reduction of Bayesian inverse problems, Inverse Problems 37 (2021) 045009.
  60. A computational framework for infinite-dimensional Bayesian inverse problems part I: The linearized case, with application to global seismic inversion, SIAM Journal on Scientific Computing 35 (2013) A2494–A2523.
  61. hIPPYlib: An extensible software framework for large-scale inverse problems governed by PDEs: Part I: Deterministic inversion and linearized Bayesian inference, ACM Transactions on Mathematical Software 47 (2021).
  62. A survey of direct methods for sparse linear systems, Acta Numerica 25 (2016) 383–566.
  63. Residual-based error correction for neural operator accelerated infinite-dimensional Bayesian inverse problems, Journal of Computational Physics 486 (2023) 112104.
  64. C. Schwab, J. Zech, Deep learning in high dimension: Neural network expression rates for analytic functions in $L^{2}(\mathbb{R}^{d},\gamma_d)$, SIAM/ASA Journal on Uncertainty Quantification 11 (2023) 199–234.
  65. Data-driven model reduction for the Bayesian solution of inverse problems, International Journal for Numerical Methods in Engineering 102 (2015) 966–990.
  66. P. Whittle, On stationary processes in the plane, Biometrika 41 (1954) 434–449.
  67. MAP estimators and their consistency in Bayesian nonparametric inverse problems, Inverse Problems 29 (2013) 095017.
  68. S. P. Brooks, A. Gelman, General methods for monitoring convergence of iterative simulations, Journal of Computational and Graphical Statistics 7 (1998) 434–455.
  69. D. Dowson, B. Landau, The Fréchet distance between multivariate normal distributions, Journal of Multivariate Analysis 12 (1982) 450–455.
  70. The FEniCS project version 1.5, Archive of Numerical Software 3 (2015).
  71. hIPPYlib: An extensible software framework for large-scale inverse problems, The Journal of Open Source Software 3 (2018) 940.
  72. Multifidelity dimension reduction via active subspaces, SIAM Journal on Scientific Computing 42 (2020) A929–A956.
