Universal Functional Regression with Neural Operator Flows (2404.02986v3)

Published 3 Apr 2024 in cs.LG and stat.ML

Abstract: Regression on function spaces is typically limited to models with Gaussian process priors. We introduce the notion of universal functional regression, in which we aim to learn a prior distribution over non-Gaussian function spaces that remains mathematically tractable for functional regression. To do this, we develop Neural Operator Flows (OpFlow), an infinite-dimensional extension of normalizing flows. OpFlow is an invertible operator that maps the (potentially unknown) data function space into a Gaussian process, allowing for exact likelihood estimation of functional point evaluations. OpFlow enables robust and accurate uncertainty quantification via drawing posterior samples of the Gaussian process and subsequently mapping them into the data function space. We empirically study the performance of OpFlow on regression and generation tasks with data generated from Gaussian processes with known posterior forms and non-Gaussian processes, as well as real-world earthquake seismograms with an unknown closed-form distribution.


Summary

  • The paper introduces Neural Operator Flows (OpFlow), a method that extends normalizing flows to infinite-dimensional function spaces for functional regression.
  • It leverages a Bayesian framework and stochastic gradient Langevin dynamics (SGLD) to infer function values from sparse observations, outperforming traditional Gaussian process methods.
  • Empirical tests demonstrate OpFlow's robustness in modeling diverse synthetic and real-world data, including earthquake seismograms, with accurate uncertainty quantification.

Universal Functional Regression with Neural Operator Flows

Introduction

In this paper, Shi et al. introduce an approach to universal functional regression (UFR) built on Neural Operator Flows (OpFlow). The model extends normalizing flows to infinite-dimensional function spaces, providing a framework for learning a prior distribution over non-Gaussian function spaces that remains tractable for functional regression. The authors empirically evaluate OpFlow on a series of regression and generation tasks with both synthetic and real-world data, demonstrating that it can accurately model and infer complex function spaces that traditional Gaussian process-based methods struggle to capture.

Neural Operator Flows

OpFlow is constructed as a sequence of invertible layers, each acting directly on function spaces. This structure permits exact likelihood estimation for functional point evaluations, a crucial feature for functional regression tasks. The architecture incorporates several key components:

  • Actnorm: An activation normalization layer that stabilizes the training process.
  • Domain and Codomain Partitioning: Two variants of OpFlow, depending on whether the coupling split is applied to the function's domain or its codomain.
  • Affine Coupling: An invertible transformation in function space that preserves resolution invariance, essential for handling different discretizations (a minimal sketch follows this list).
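To make the coupling mechanics concrete, here is a minimal sketch of an affine coupling step in PyTorch. OpFlow's couplings act on functions, so the scale and shift maps would be neural operators; the 1D convolutional `s_net` and `t_net` below are illustrative stand-ins, not the authors' architecture.

```python
# Minimal affine-coupling sketch (PyTorch). In OpFlow the scale/shift
# maps would be neural operators acting on functions; the Conv1d
# networks here are illustrative stand-ins.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, channels: int, hidden: int = 64):
        super().__init__()
        # Scale (s) and shift (t) networks conditioned on the untouched half.
        self.s_net = nn.Sequential(
            nn.Conv1d(channels, hidden, 3, padding=1), nn.GELU(),
            nn.Conv1d(hidden, channels, 3, padding=1), nn.Tanh())
        self.t_net = nn.Sequential(
            nn.Conv1d(channels, hidden, 3, padding=1), nn.GELU(),
            nn.Conv1d(hidden, channels, 3, padding=1))

    def forward(self, x1, x2):
        # x1 passes through unchanged; x2 is scaled and shifted by
        # functions of x1, which keeps the map invertible.
        s = self.s_net(x1)
        y2 = x2 * torch.exp(s) + self.t_net(x1)
        log_det = s.sum(dim=(1, 2))   # log |det Jacobian| for the likelihood
        return x1, y2, log_det

    def inverse(self, y1, y2):
        # Exact inverse: undo the shift, then the scale.
        s = self.s_net(y1)
        x2 = (y2 - self.t_net(y1)) * torch.exp(-s)
        return y1, x2
```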

Training OpFlow involves minimizing the negative log-likelihood of the data, augmented with a regularization term based on the 2-Wasserstein distance that stabilizes learning and helps the model converge to the true probability measure. By enabling posterior estimation over entire physical domains, this marks a notable advance over existing functional regression models. A hedged sketch of this objective follows.
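The sketch below assembles the objective described above: the exact negative log-likelihood from the change-of-variables formula plus a 2-Wasserstein penalty toward the reference Gaussian. The diagonal-Gaussian form of the W2 term, the weighting `lam`, and the interface of `flow` are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of the training objective: exact NLL via change of
# variables plus a 2-Wasserstein penalty toward the reference Gaussian.
# `flow` is assumed to return latent codes and the log-det Jacobian;
# `prior_mean`/`prior_std` are tensors broadcastable to z.
import torch

def training_loss(flow, x, prior_mean, prior_std, lam=1.0):
    z, log_det = flow(x)                          # z: (batch, n_points)
    # Change of variables: log p_X(x) = log p_Z(z) + log |det dz/dx|
    log_pz = -0.5 * (((z - prior_mean) / prior_std) ** 2
                     + torch.log(2 * torch.pi * prior_std ** 2)).sum(-1)
    nll = -(log_pz + log_det).mean()
    # Closed-form W2^2 between diagonal Gaussians: batch statistics of z
    # versus the reference prior (a simplification for illustration).
    m, s = z.mean(0), z.std(0)
    w2_sq = ((m - prior_mean) ** 2 + (s - prior_std) ** 2).sum()
    return nll + lam * w2_sq
```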

Universal Functional Regression with OpFlow

OpFlow's contribution extends to performing UFR in a principled Bayesian framework. Using the trained OpFlow as a learned prior, the paper demonstrates how to infer function values across an entire domain given only sparse observations. Inference maximizes the likelihood of the observed values under the learned prior, with posterior samples drawn through stochastic gradient Langevin dynamics (SGLD); a minimal sketch of this loop follows. The paper showcases the approach across several datasets, including Gaussian processes, truncated Gaussian processes, Gaussian random fields, and real-world seismic waveform data, revealing OpFlow's flexibility and robustness in capturing both Gaussian and non-Gaussian process distributions.
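As a rough illustration of this inference loop, the SGLD sketch below draws posterior samples in the latent Gaussian space and maps them back to the data function space. The `flow.inverse` interface, the Gaussian observation noise `sigma`, the step size `eta`, and the standard-normal latent prior are all assumptions for illustration, not the authors' exact setup.

```python
# Hedged SGLD sketch for posterior sampling from a trained flow.
import torch

def sgld_posterior_samples(flow, obs_idx, obs_vals, n_points,
                           sigma=0.1, eta=1e-4, n_steps=5000):
    # Latent initialization from the (assumed standard-normal) prior.
    z = torch.randn(n_points, requires_grad=True)
    samples = []
    for t in range(n_steps):
        u = flow.inverse(z)                      # latent -> data function
        # Log posterior = Gaussian data fit at observed points + latent prior.
        log_lik = -0.5 * ((u[obs_idx] - obs_vals) ** 2).sum() / sigma ** 2
        log_prior = -0.5 * (z ** 2).sum()
        grad, = torch.autograd.grad(log_lik + log_prior, z)
        with torch.no_grad():
            # Langevin update: half-step up the gradient plus injected noise.
            z += 0.5 * eta * grad + eta ** 0.5 * torch.randn_like(z)
        if t % 100 == 0:                         # thin the chain
            samples.append(flow.inverse(z).detach())
    return samples
```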

Empirical Evaluation and Findings

OpFlow's performance is empirically evaluated through regression tasks on both synthetic and real-world data. Across tasks, OpFlow achieves accurate posterior estimation, effectively capturing the underlying function spaces and offering accurate uncertainty quantification. These results are particularly significant for non-Gaussian processes, where traditional methods fall short. For instance, in modeling earthquake seismograms, OpFlow outperforms Gaussian process regression, highlighting its potential in domains where data exhibit heavy-tailed or multimodal distributions.

Discussion and Future Work

OpFlow represents a significant advance in learning priors over function spaces, enabling accurate functional regression and generation across a broad spectrum of applications. Its ability to handle non-Gaussian processes and to provide exact likelihood estimation positions it as a powerful tool for extracting insight from complex data. Looking forward, the flexibility and effectiveness of OpFlow suggest promising avenues for further research, particularly in fields where understanding the underlying function spaces is critical for prediction and decision-making.