
Function-on-Function Gaussian Process (FFGP)

Updated 18 November 2025
  • FFGP is a framework for modeling mappings between infinite-dimensional function spaces using operator-valued kernels and Hilbert space representations.
  • It employs nonparametric Bayesian regression and efficient eigendecomposition to enable accurate operator learning without discretization approximations.
  • FFGP extends classical Gaussian processes to functional inputs and outputs, offering enhanced uncertainty quantification and scalable computation for complex systems.

A function-on-function Gaussian process (FFGP) is a mathematical framework for modeling mappings where both the input and the output reside in infinite-dimensional function spaces. The FFGP formalism enables nonparametric Bayesian regression, operator learning, and uncertainty quantification in diverse fields, including functional data analysis, operator learning for partial differential equations, and Bayesian optimization in complex system design. FFGPs directly model the joint distribution over functions, enabling efficient and flexible inference without discretization or basis-expansion approximations that are traditionally required for functional inputs or outputs.

1. Definition and Mathematical Framework

An FFGP models a mapping $f: \mathcal{X}^p \rightarrow \mathcal{Y}$, where $\mathcal{X}^p = \mathcal{X} \times \cdots \times \mathcal{X}$ and $\mathcal{X} \subset L^2(\Omega_x)$ for compact $\Omega_x \subset \mathbb{R}^d$, and $\mathcal{Y} = L^2(\Omega_y)$ for compact $\Omega_y$ (often $[0,1]$). Both spaces are Hilbert spaces with $L^2$ inner product structure. The FFGP is characterized by specifying a mean function $\mu \in \mathcal{Y}$ and a positive-definite operator-valued kernel $K: \mathcal{X}^p \times \mathcal{X}^p \to \mathcal{L}(\mathcal{Y})$, leading to the Gaussian process prior

$$f(\cdot) \sim \operatorname{FFGP}\big(\mu,\, K(\cdot, \cdot)\big).$$

For any finite collection of inputs $x_1, \ldots, x_n \in \mathcal{X}^p$, the outputs $[f(x_i)]_{i=1}^n$ jointly follow a (functional) Gaussian law in $(L^2)^n$, with mean $[\mu, \ldots, \mu]$ and blockwise covariance $K(x_i, x_j)$ (Huang et al., 16 Nov 2025).

2. Operator-Valued Kernels in FFGPs

FFGPs use operator-valued kernels to encode dependencies between function-valued inputs and outputs. The standard construction is the separable operator-valued kernel $K(x, x') = \sigma^2\, k_x(x, x')\, T_{\mathcal{Y}}$, where $k_x$ is a positive-definite scalar kernel on $\mathcal{X}^p \times \mathcal{X}^p$, typically built from the $L^2$ distance $r(x, x') = \| (x - x') / \psi_x \|_{L^2}$ together with a Matérn-$\nu$ form for smoothness control, and $T_{\mathcal{Y}} \in \mathcal{L}(\mathcal{Y})$ is a nonnegative self-adjoint operator, often a Hilbert–Schmidt integral operator with a kernel such as $k_y(s, t) = \exp(-|s-t|/\psi_y)$ or the Wiener kernel $k_y(s, t) = \min(s,t)/\psi_y$ (Huang et al., 16 Nov 2025).

Under this construction, the covariance between $f(x)(t)$ and $f(x')(t')$ decomposes as

$$\mathrm{Cov}\big(f(x)(t),\, f(x')(t')\big) = \sigma^2\, k_x(x, x')\, k_y(t, t').$$
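
To make the separable construction concrete, the sketch below discretizes both domains on grids, builds $k_x$ from the $L^2$ distance between input functions with a Matérn-5/2 form, uses the exponential kernel for $T_{\mathcal{Y}}$, and draws output curves from the finite-dimensional Gaussian law of Section 1. Grid sizes, hyperparameters, and the choice $\nu = 5/2$ are illustrative assumptions, not values from the cited work.

```python
import numpy as np

# Minimal sketch of the separable FFGP prior under assumed hyperparameters.
# Input functions are represented by their values on a grid of Omega_x = [0, 1];
# output functions live on a grid of Omega_y = [0, 1].
rng = np.random.default_rng(0)
s_grid = np.linspace(0.0, 1.0, 50)            # grid for input functions
t_grid = np.linspace(0.0, 1.0, 40)            # grid for output functions
ds = s_grid[1] - s_grid[0]
sigma2, psi_x, psi_y = 1.0, 0.5, 0.3          # illustrative hyperparameters

def k_x(xa, xb):
    """Matern-5/2 scalar kernel on the scaled L2 distance r = ||xa - xb||_{L2} / psi_x."""
    r = np.sqrt(np.sum((xa - xb) ** 2) * ds) / psi_x
    return (1.0 + np.sqrt(5) * r + 5.0 * r**2 / 3.0) * np.exp(-np.sqrt(5) * r)

# Discretized T_Y: exponential kernel k_y(s, t) = exp(-|s - t| / psi_y).
T_y = np.exp(-np.abs(t_grid[:, None] - t_grid[None, :]) / psi_y)

# Two example input functions and their scalar-kernel Gram matrix.
x1, x2 = np.sin(2 * np.pi * s_grid), np.cos(3 * np.pi * s_grid)
Kx = np.array([[k_x(x1, x1), k_x(x1, x2)],
               [k_x(x2, x1), k_x(x2, x2)]])

# Joint covariance of [f(x1), f(x2)] on t_grid has the Kronecker (blockwise) form
# Cov(f(x)(t), f(x')(t')) = sigma^2 * k_x(x, x') * k_y(t, t').
C = sigma2 * np.kron(Kx, T_y)
draw = rng.multivariate_normal(np.zeros(C.shape[0]), C + 1e-8 * np.eye(C.shape[0]))
f_x1, f_x2 = draw[:t_grid.size], draw[t_grid.size:]   # two correlated output curves
```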

3. Posterior Inference and Predictive Distributions

Given observations $y_i = f(x_i) + \varepsilon_i$ with $\varepsilon_i \sim \mathcal{N}(0, \tau^2 I_{\mathcal{Y}})$ in $\mathcal{Y}$, posterior inference exploits the eigendecomposition of $T_{\mathcal{Y}}$ and the Gram matrix of $k_x$. For a new input $x$, the posterior mean and covariance operator are

$$\begin{aligned} \hat{f}(x) &= \mu + K_n(x)^\top \left( \mathcal{K}_n + \tau^2 I_{\mathcal{Y}} \right)^{-1} (Y_n - 1_n \mu), \\ \widehat{K}(x, x) &= K(x, x) - K_n(x)^\top \left( \mathcal{K}_n + \tau^2 I_{\mathcal{Y}} \right)^{-1} K_n(x), \end{aligned}$$

where $\mathcal{K}_n = [K(x_i, x_j)]_{i,j}$, $K_n(x) = [K(x_i, x)]_{i=1}^n$, and $Y_n$ stacks all observed functions. Series expressions are derived using the eigendecomposition $T_{\mathcal{Y}} v_i = \beta_i v_i$, truncating when $\sum_{i=1}^m \beta_i$ captures $>90\%$ of the trace of $T_{\mathcal{Y}}$ (Huang et al., 16 Nov 2025).
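
Under the separable kernel of Section 2 and white observation noise, projecting the data onto the leading eigenfunctions $v_1, \ldots, v_m$ of $T_{\mathcal{Y}}$ decouples the posterior into $m$ independent scalar GP regressions. The sketch below illustrates this; it assumes $L^2$-orthonormal eigenfunctions and centered training curves already projected onto them, and it is not the reference implementation of Huang et al. (16 Nov 2025).

```python
import numpy as np

# Hedged sketch of FFGP posterior-mean computation via the eigendecomposition
# T_Y v_i = beta_i v_i, assuming the separable kernel K = sigma^2 * k_x * T_Y
# and white noise tau^2 * I_Y. Each retained eigenmode behaves as an
# independent scalar GP with kernel sigma^2 * beta_i * k_x and noise tau^2.
def ffgp_posterior_mean(Kx_train, kx_star, Y_coef, betas, sigma2, tau2):
    """Posterior mean coefficients at a new input x*.

    Kx_train : (n, n) Gram matrix of k_x at the training inputs
    kx_star  : (n,)   values k_x(x_j, x*) for the new input
    Y_coef   : (n, m) centered training curves projected onto v_1..v_m
    betas    : (m,)   leading eigenvalues of T_Y
    Returns  : (m,)   coefficients c_i with f_hat(x*)(t) = mu(t) + sum_i c_i v_i(t)
    """
    n, m = Y_coef.shape
    mean_coef = np.empty(m)
    for i in range(m):
        C = sigma2 * betas[i] * Kx_train + tau2 * np.eye(n)   # mode-i covariance
        alpha = np.linalg.solve(C, Y_coef[:, i])
        mean_coef[i] = sigma2 * betas[i] * (kx_star @ alpha)
    return mean_coef
```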

4. Computational Complexity and Scalability

Training FFGP models centers on the eigendecomposition of the $n \times n$ Gram matrix (complexity $O(n^\omega)$, with $2 < \omega < 2.376$) and the spectral representation of $T_{\mathcal{Y}}$. Each log-likelihood gradient evaluation costs $O(N_{mc}\, n^2 m + (p+3)\, n m)$, where $N_{mc}$ is the functional norm computation cost and $m$ the retained eigenspectrum rank. Prediction at a new input, post-truncation, requires $O(m n + m^2)$ operations (Huang et al., 16 Nov 2025).
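
For illustration, the $>90\%$-of-trace truncation rule quoted in Section 3 can be implemented in a few lines; here `betas` is assumed to hold the eigenvalues of a (discretized) $T_{\mathcal{Y}}$, as in the earlier sketches.

```python
import numpy as np

# Choose the truncation rank m as the smallest number of leading eigenvalues
# of T_Y whose cumulative sum captures a target fraction of the trace.
def truncation_rank(betas, frac=0.9):
    betas = np.sort(np.asarray(betas, dtype=float))[::-1]   # descending eigenvalues
    captured = np.cumsum(betas) / betas.sum()               # fraction of trace captured
    return int(np.searchsorted(captured, frac) + 1)
```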

Scalable extensions build on this low-rank spectral truncation of $T_{\mathcal{Y}}$, retaining only the leading $m$ eigenmodes; the resulting truncation error decays as $O(m^{-1})$ (Huang et al., 16 Nov 2025).

5. Comparison with Related Approaches

The FFGP extends classical GP regression to function-valued input–output mappings in a mathematically consistent way:

  • Multi-output GPs/matrix-valued kernels (e.g., Conti–O’Hagan, Bonilla et al.) handle vector outputs via discretization or fixed basis but cannot natively handle infinite-dimensional function outputs.
  • Functional-input Bayesian optimization (FIBO) targets function-to-scalar mappings in RKHS but does not support functional outputs.
  • FOBO models (functional-output Bayesian optimization) handle scalar or vector inputs mapped to functional outputs via FPCA discretization, introducing discretization error and potentially losing accuracy on irregular grids.
  • The FFGP achieves full infinite-dimensional modeling without pre-discretization—enabling accurate operator learning and uncertainty quantification (Huang et al., 16 Nov 2025, Lowery et al., 24 Oct 2025).

6. Modern Architectures and Extensions

Several architectures build upon and generalize the FFGP concept:

  • Deep Gaussian Processes for Functional Maps (DGPFM) stack layers of GP-based integral transforms and nonlinear GP activations to model highly nonlinear function-on-function maps. Discrete approximations of kernel integral transforms collapse to direct functional transforms, enabling scalable inference and uncertainty quantification. Empirically, DGPFM outperforms Bayesian neural operators and FNO-based architectures in predictive accuracy and uncertainty calibration on PDE and real-world datasets (Lowery et al., 24 Oct 2025). A minimal sketch of a discretized kernel integral transform appears after this list.
  • Linearization-based function-valued GPs for neural operators construct a Laplace-approximated Bayesian posterior in neural operator weight space, propagate it via first-order Taylor expansion, and "curry" the joint GP over input-function/evaluation pairs into a function-on-function GP. Resolution-agnostic, efficient sampling is achieved via the spectral representation of the neural operator, with closed-form predictions for entire output functions (Magnani et al., 7 Jun 2024).
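
As a concrete illustration of the first item above, the following is a minimal sketch of a discretized kernel integral transform, the basic operation applied in DGPFM-style layers before their GP activations. The Gaussian kernel, quadrature weights, and grid sizes are assumptions for illustration only and do not reproduce the architecture of Lowery et al. (24 Oct 2025).

```python
import numpy as np

# Hedged sketch: a kernel integral transform h(t) = \int k(t, s) f(s) ds,
# approximated on a grid. This shows only the discretization step; the kernel
# choice and lengthscale below are illustrative assumptions.
def kernel_integral_transform(f_vals, s_grid, t_grid, lengthscale=0.2):
    """Map samples f(s_grid) to h(t_grid) via quadrature over s."""
    w = np.gradient(s_grid)                              # approximate quadrature weights
    K = np.exp(-0.5 * (t_grid[:, None] - s_grid[None, :]) ** 2 / lengthscale**2)
    return K @ (w * f_vals)                              # h(t) ~= sum_s w_s k(t, s) f(s)

# Example: transform a sine wave observed on 64 points onto a coarser 32-point grid.
s = np.linspace(0.0, 1.0, 64)
t = np.linspace(0.0, 1.0, 32)
h = kernel_integral_transform(np.sin(2 * np.pi * s), s, t)
```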

7. Implementation, Practicalities, and Applications

The FFGP paradigm underlies a range of practical frameworks:

  • The GPFDA package implements GP-based function-on-function regression, including both concurrent and historical mean structures, flexible kernel specification (separable, tensor-product, non-separable), and closed-form prediction. Predictions are available both when part of a new response curve is observed (Type I) and when entirely new functional covariates are supplied (Type II). GPFDA also leverages marginal likelihood for hyperparameter selection and supports additive and nonstationary kernels (Konzen et al., 2021).
  • FFGP-based surrogates are used in function-on-function Bayesian optimization (FFBO), where UCB-style acquisition functions use operator-weighted scalarizations, and scalable function-space gradient ascent algorithms search for optimal input functions. Theoretical guarantees include well-posedness of the posterior, truncation error decaying as $O(m^{-1})$ with increasing truncation rank, and high-probability sublinear regret in Bayesian optimization (Huang et al., 16 Nov 2025). A hedged sketch of a UCB-style acquisition on the truncated eigenbasis follows this list.
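
As referenced in the last item above, the following is a hedged sketch of a UCB-style acquisition evaluated on the retained eigenmodes. The scalarization weights `w` and the exploration parameter `kappa` are assumptions for illustration; the operator-weighted scalarization of Huang et al. (16 Nov 2025) is defined in the paper and not reproduced exactly here.

```python
import numpy as np

# Hedged sketch of a UCB-style acquisition for FFBO on the m retained eigenmodes.
# Assumes the per-mode posteriors are independent (a consequence of the separable
# kernel with white noise), so the scalarized variance is sum_i w_i^2 * var_i.
def ucb_acquisition(mean_coef, var_coef, w, kappa=2.0):
    """mean_coef, var_coef, w : (m,) posterior mean coefficients, posterior
    variances, and scalarization weights for the retained eigenmodes."""
    scalar_mean = w @ mean_coef
    scalar_std = np.sqrt(np.maximum(w**2 @ var_coef, 0.0))
    return scalar_mean + kappa * scalar_std
```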

FFGP models are deployed in applications involving functional data on irregular grids, spatiotemporal operator learning, and design optimization under complex constraints—offering significant improvements in data efficiency and uncertainty quantification relative to discretization-based GP surrogates, neural operators, or functional regression methods.

