GP+: A Python Library for Kernel-based learning via Gaussian Processes (2312.07694v2)
Abstract: We introduce GP+, an open-source library for kernel-based learning via Gaussian processes (GPs), which are powerful statistical models that are completely characterized by their parametric covariance and mean functions. GP+ is built on PyTorch and provides a user-friendly, object-oriented tool for probabilistic learning and inference. As we demonstrate with a host of examples, GP+ has a few unique advantages over other GP modeling libraries, which we achieve primarily by integrating nonlinear manifold learning techniques with GPs' covariance and mean functions. We also make methodological contributions that (1) enable probabilistic data fusion and inverse parameter estimation, and (2) equip GPs with parsimonious parametric mean functions that span mixed feature spaces with both categorical and quantitative variables. We demonstrate the impact of these contributions in the context of Bayesian optimization, multi-fidelity modeling, sensitivity analysis, and calibration of computer models.
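To make concrete the statement that a GP is completely characterized by its covariance and mean functions, the following is a minimal sketch of GP regression in plain Python. It does not use the GP+ API: the function names (`rbf_kernel`, `cholesky`, `solve_cholesky`, `gp_posterior`), the constant mean, the squared-exponential kernel, and the fixed hyperparameters are all illustrative assumptions, and a real library would estimate the hyperparameters from data.

```python
import math

def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l

def solve_cholesky(l, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x

def gp_posterior(x_train, y_train, x_new, mean=0.0, nugget=1e-8):
    """Posterior mean and variance at x_new for a constant-mean GP with an RBF kernel."""
    n = len(x_train)
    # Covariance matrix of the training inputs; the nugget stabilizes the factorization.
    k = [[rbf_kernel(x_train[i], x_train[j]) + (nugget if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    l = cholesky(k)
    # Covariance between the new input and each training input.
    k_star = [rbf_kernel(x_new, xi) for xi in x_train]
    alpha = solve_cholesky(l, [yi - mean for yi in y_train])
    v = solve_cholesky(l, k_star)
    post_mean = mean + sum(ks * a for ks, a in zip(k_star, alpha))
    post_var = rbf_kernel(x_new, x_new) - sum(ks * vi for ks, vi in zip(k_star, v))
    return post_mean, max(post_var, 0.0)

if __name__ == "__main__":
    x_train = [0.0, 1.0, 2.0]
    y_train = [0.0, 0.84, 0.91]
    print(gp_posterior(x_train, y_train, 1.5))
```

In this sketch the prediction interpolates the training data with near-zero variance at observed inputs and reverts to the prior mean and variance far from them, so the choice of mean and covariance functions fully determines the model's behavior.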