
Function-on-Function Bayesian Optimization

Updated 18 November 2025
  • Function-on-Function Bayesian Optimization is a framework that models both inputs and outputs as functions in infinite-dimensional spaces.
  • It employs novel surrogate models like function-on-function Gaussian processes and neural operator networks to efficiently quantify uncertainty.
  • FFBO uses scalarization and functional gradient methods to optimize expensive black-box mappings in applications such as PDE simulations and engineering design.

Function-on-Function Bayesian Optimization (FFBO) refers to a class of Bayesian optimization methods in which both the inputs and outputs of the target function are functions defined on continuous domains, often infinite-dimensional. FFBO generalizes classical Bayesian optimization beyond finite-dimensional (vector) spaces to settings arising in advanced scientific and engineering systems featuring functional parameters, controls, or outputs, such as PDE-based simulators, functional mechanical designs, and scientific operator learning. FFBO introduces new surrogate modeling, acquisition, and optimization strategies to address the mathematical and algorithmic challenges inherent to infinite-dimensional function spaces (Huang et al., 16 Nov 2025, Guilhoto et al., 3 Apr 2024, Jain et al., 2023).

1. Problem Formulation in FFBO

The FFBO framework addresses the global optimization of a mapping

$$f: \mathcal{X}^p \to \mathcal{Y},$$

where both the input $\bm{x} = (x_1, \ldots, x_p) \in \mathcal{X}^p \subset [L^2(\Omega_x)]^p$ and the output $f(\bm{x}) \in \mathcal{Y} = L^2(\Omega_y)$ are functions over compact domains $\Omega_x, \Omega_y \subset \mathbb{R}^d$. The typical objective is to maximize a user-specified linear functional of the output,

$$L_\phi f(\bm{x}) = \int_{\Omega_y} \phi(t)\, f(\bm{x})(t)\,dt,$$

with $\phi \in L^2(\Omega_y)$ acting as a weighting function (e.g., Dirac delta, uniform, or a smoothing kernel). The black-box mapping $f$ is expensive to evaluate, and observations are noisy realizations in function space: $y_i = f(\bm{x}_i) + \varepsilon_i$, $\varepsilon_i \sim \mathcal{N}(0, \tau^2 I_\mathcal{Y})$, $i = 1, \ldots, n$. The optimization goal is to find

$$\bm{x}^\star = \arg\max_{\bm{x} \in \mathcal{X}^p} L_\phi f(\bm{x}),$$

with as few expensive queries as possible, leveraging structure in $f$ and prior information (Huang et al., 16 Nov 2025).
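
The following minimal sketch (not taken from the cited papers) illustrates how these functional objects are typically handled numerically: input and output functions are represented by their values on fixed grids, an inexpensive toy `simulator` stands in for the expensive mapping $f$, and $L_\phi$ is approximated by trapezoidal quadrature. The function names, grids, and weighting choice are illustrative assumptions.

```python
import numpy as np

# Grids discretizing the input domain Omega_x and the output domain Omega_y.
s_grid = np.linspace(0.0, 1.0, 101)
t_grid = np.linspace(0.0, 1.0, 101)

def simulator(x_vals):
    """Toy stand-in for the expensive black box f: maps x(s) on s_grid to f(x)(t) on t_grid."""
    return np.array([np.trapz(x_vals * np.cos(np.pi * t * s_grid), s_grid) for t in t_grid])

def scalarize(f_vals, phi_vals):
    """Approximate L_phi f(x) = integral of phi(t) f(x)(t) dt by trapezoidal quadrature."""
    return np.trapz(phi_vals * f_vals, t_grid)

x_vals = np.sin(np.pi * s_grid)     # a candidate functional input x(s)
phi_vals = np.ones_like(t_grid)     # uniform weighting function phi(t)
print(scalarize(simulator(x_vals), phi_vals))
```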

2. Surrogate Modeling Approaches

2.1 Function-on-Function Gaussian Processes

The function-on-function Gaussian Process (FFGP) surrogate models the mapping $f: \mathcal{X}^p \to \mathcal{Y}$ directly in the function space. The FFGP places a Gaussian process prior $f(\cdot) \sim \mathcal{FFGP}(\mu(\cdot), K(\cdot, \cdot))$, where $\mu: \mathcal{X}^p \to \mathcal{Y}$ is the mean and $K: \mathcal{X}^p \times \mathcal{X}^p \to \mathcal{L}(\mathcal{Y})$ is an operator-valued kernel. The standard choice is separable,
$$K(\bm{x}_i, \bm{x}_j) = \sigma^2 \, k_x(\bm{x}_i, \bm{x}_j) \, T_\mathcal{Y},$$
with $k_x$ a positive-definite scalar kernel over function inputs (using the $L^2$ distance) and $T_\mathcal{Y}$ a self-adjoint, positive Hilbert–Schmidt operator (e.g., an integral operator with kernel $k_y(s, t)$) (Huang et al., 16 Nov 2025).

Posterior inference yields a mean and covariance in $\mathcal{Y}$, with explicit expressions for practical computation using truncated eigenbasis decompositions.
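
As a rough numerical illustration of the separable kernel, the sketch below assembles the discretized covariance $\sigma^2\, K_x \otimes T_\mathcal{Y}$ for a few toy functional inputs, with $k_x$ a squared-exponential kernel on the $L^2$ distance and $T_\mathcal{Y}$ an integral operator with a squared-exponential kernel $k_y$. The kernels, lengthscales, and grids are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

def l2_distance(x1, x2, grid):
    """L^2 distance between two input functions sampled on a common grid."""
    return np.sqrt(np.trapz((x1 - x2) ** 2, grid))

def k_x(x1, x2, grid, lengthscale=0.5):
    """Scalar squared-exponential kernel over functional inputs via their L^2 distance."""
    d = l2_distance(x1, x2, grid)
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def t_y_matrix(t_grid, lengthscale=0.2):
    """Discretized output operator T_Y: an integral operator with SE kernel k_y(s, t)."""
    diff = t_grid[:, None] - t_grid[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

# Separable covariance between n functional inputs: sigma^2 * (K_x kron T_Y).
s_grid = np.linspace(0, 1, 50)
t_grid = np.linspace(0, 1, 60)
X = [np.sin(np.pi * s_grid), np.cos(np.pi * s_grid), s_grid ** 2]   # toy inputs
sigma2 = 1.0
Kx = np.array([[k_x(xi, xj, s_grid) for xj in X] for xi in X])
Cov = sigma2 * np.kron(Kx, t_y_matrix(t_grid))   # covariance of the vectorized outputs
print(Cov.shape)                                 # (n * m, n * m)
```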

2.2 Neural Operator Surrogates

An alternative surrogate uses operator-learning neural networks such as NEON (Neural Epistemic Operator Networks) (Guilhoto et al., 3 Apr 2024). NEON comprises:

  • An encoder–decoder backbone: encoder $e_{\xi_e}: \mathbb{R}^{d_u} \to \mathbb{R}^{d_\beta}$ and decoder $d_{\xi_d}: \mathbb{R}^{d_\beta} \times \mathcal{Y} \to \mathbb{R}^{d_s}$, modeling the deterministic operator $x \mapsto h(x)(y)$.
  • An epistemic uncertainty quantifier ("EpiNet"): a small network $\sigma_\eta$ indexed by a random variable $z$, providing an ensemble of predictions for efficient uncertainty quantification.

The NEON surrogate allows efficient parameterization and training, with ensembles generated via the EpiNet, and achieves parameter efficiency—typically requiring 1–2 orders of magnitude fewer trainable parameters than deep ensembles with similar performance (Guilhoto et al., 3 Apr 2024).
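
A schematic PyTorch sketch of this architecture follows; the layer widths, activations, latent dimensions, and the stop-gradient on the EpiNet's backbone features are illustrative assumptions rather than the published configuration.

```python
import torch
import torch.nn as nn

class NEONSurrogate(nn.Module):
    """Schematic NEON-style surrogate: encoder-decoder backbone plus a small EpiNet head."""

    def __init__(self, d_u, d_beta, d_y, d_s, d_z=8, width=64):
        super().__init__()
        self.d_z = d_z
        # Encoder e_{xi_e}: R^{d_u} -> R^{d_beta}
        self.encoder = nn.Sequential(nn.Linear(d_u, width), nn.GELU(), nn.Linear(width, d_beta))
        # Decoder d_{xi_d}: R^{d_beta} x Y -> R^{d_s}, with Y represented by query points in R^{d_y}
        self.decoder = nn.Sequential(nn.Linear(d_beta + d_y, width), nn.GELU(), nn.Linear(width, d_s))
        # EpiNet sigma_eta: a small network whose output depends on the epistemic index z
        self.epinet = nn.Sequential(nn.Linear(d_beta + d_y + d_z, width), nn.GELU(), nn.Linear(width, d_s))

    def forward(self, u, y, z):
        beta = self.encoder(u)                                       # latent code for the input function
        base = self.decoder(torch.cat([beta, y], dim=-1))            # deterministic operator prediction
        eps = self.epinet(torch.cat([beta.detach(), y, z], dim=-1))  # epistemic perturbation
        return base + eps

model = NEONSurrogate(d_u=32, d_beta=16, d_y=1, d_s=1)
u = torch.randn(4, 32)           # discretized input functions
y = torch.rand(4, 1)             # query locations in the output domain
z = torch.randn(4, model.d_z)    # sampled epistemic indices -> an implicit ensemble
print(model(u, y, z).shape)      # torch.Size([4, 1])
```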

2.3 Composite Function Surrogates

In settings where $f$ decomposes as a composition $f = g \circ h$, where $h: X \rightarrow Y$ is unknown and expensive while $g: Y \rightarrow \mathbb{R}$ is known and cheap, the surrogate models $h$ directly and explicitly propagates uncertainty through $g$ (Guilhoto et al., 3 Apr 2024, Jain et al., 2023).
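
The sketch below illustrates this idea under simple assumptions: each component of a toy intermediate map $h$ is modeled with an independent GP, and posterior samples are pushed through a known outer function $g$ to obtain samples of the composite objective. The functions, kernels, and sample sizes are placeholders.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def h_true(x):                       # stand-in for the expensive intermediate map h: X -> R^2
    return np.column_stack([np.sin(3 * x), np.cos(2 * x)])

def g(y):                            # known, cheap outer function g: R^2 -> R
    return y[..., 0] ** 2 - 0.5 * y[..., 1]

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 1, size=(8, 1))
Y_train = h_true(X_train[:, 0])

# One independent GP per output component of h.
gps = [GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-4).fit(X_train, Y_train[:, j])
       for j in range(Y_train.shape[1])]

X_test = np.linspace(0, 1, 50)[:, None]
# Draw posterior samples of each component, then propagate them through the known g.
h_samples = np.stack([gp.sample_y(X_test, n_samples=64, random_state=1) for gp in gps], axis=-1)
f_samples = g(h_samples)             # shape (50, 64): samples of the composite objective
print(f_samples.mean(axis=1)[:3], f_samples.std(axis=1)[:3])
```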

3. Acquisition Function Construction in Function Space

FFBO adopts scalarization strategies to reduce function-valued outputs to scalar acquisition criteria:

  • Operator-Based Scalarization: Using a weighting function $\phi$, define

$$g_\phi(\bm{x}) = L_\phi f(\bm{x}) = \int_{\Omega_y} \phi(t)\, f(\bm{x})(t)\, dt,$$

which inherits a scalar-valued GP structure from the FFGP model; this enables use of established acquisition strategies (Huang et al., 16 Nov 2025).

  • Probabilistic Acquisition Functions: Expected Improvement (EI), Leaky EI (L-EI), and Upper Confidence Bound (UCB) are extended to function-on-function surrogates by propagating the stochastic surrogate's uncertainty through the scalarization. NEON-based approaches estimate acquisition function values via Monte Carlo sampling over epistemic indices $z$ (Guilhoto et al., 3 Apr 2024); a minimal sketch follows this list.
  • Composite EI and UCB: In the composition setting, composite EI (cEI) and composite UCB (cUCB) acquisition functions are constructed by propagating the GP uncertainty of intermediate functions through the known composition structure. Empirical mean and variance are combined to derive acquisition values (Jain et al., 2023).
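
The following sketch shows one way to estimate scalarized UCB and EI by Monte Carlo, as referenced above; the synthetic draws stand in for FFGP posterior samples or NEON predictions at several epistemic indices, and `kappa`, the grid, and the toy data are arbitrary choices.

```python
import numpy as np

def mc_acquisition(f_samples, phi_vals, t_grid, best_so_far, kappa=2.0):
    """Monte Carlo estimates of scalarized UCB and EI at a candidate input x.

    f_samples : array (S, len(t_grid)) of surrogate draws of f(x)(.)
    phi_vals  : weighting function phi evaluated on t_grid
    """
    g_samples = np.trapz(phi_vals * f_samples, t_grid, axis=-1)   # L_phi f(x), one value per draw
    mu, sd = g_samples.mean(), g_samples.std()
    ucb = mu + kappa * sd
    ei = np.maximum(g_samples - best_so_far, 0.0).mean()          # Monte Carlo expected improvement
    return ucb, ei

# Toy usage with synthetic draws.
t_grid = np.linspace(0, 1, 101)
rng = np.random.default_rng(0)
f_samples = np.sin(2 * np.pi * t_grid) + 0.1 * rng.standard_normal((64, t_grid.size))
print(mc_acquisition(f_samples, np.ones_like(t_grid), t_grid, best_so_far=0.2))
```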

4. Optimization over Function Spaces

Optimization in FFBO operates over infinite-dimensional domains. The primary optimization procedure is functional gradient ascent (FGA), relying on Fréchet derivatives computed in the Banach or Hilbert space of functions (Huang et al., 16 Nov 2025).

Algorithmic steps in FGA (a minimal numerical sketch follows the list):

  • Initialize the function input $\bm{x}^{(0)}$.
  • Iteratively update via

$$\bm{x}^{(\ell)} = \bm{x}^{(\ell-1)} + \gamma \, \nabla \alpha_{\mathrm{UCB}}\big|_{\bm{x}^{(\ell-1)}},$$

employing the analytic gradients of the acquisition function with respect to the input functions.

  • Terminate after a fixed number of iterations or upon convergence; return the maximizing $\bm{x}$ as the next query point.
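
A minimal numerical sketch of the FGA loop is given below; it replaces the analytic Fréchet gradients of the cited method with coordinate-wise finite differences on a discretized input, and the quadratic toy acquisition is purely illustrative.

```python
import numpy as np

def functional_gradient_ascent(acq, x0, step=1.0, n_iter=200, eps=1e-4):
    """Gradient ascent on an acquisition alpha(x) over a discretized input function.
    The Frechet derivative is approximated here by finite differences for illustration."""
    x = x0.copy()
    for _ in range(n_iter):
        grad = np.zeros_like(x)
        for i in range(x.size):                      # finite-difference surrogate gradient
            e = np.zeros_like(x)
            e[i] = eps
            grad[i] = (acq(x + e) - acq(x - e)) / (2 * eps)
        x = x + step * grad                          # x^(l) = x^(l-1) + gamma * grad alpha
    return x

# Toy acquisition over a 30-point discretization, maximized at x(s) = sin(pi s).
grid = np.linspace(0, 1, 30)
acq = lambda x: -np.trapz((x - np.sin(np.pi * grid)) ** 2, grid)
x_next = functional_gradient_ascent(acq, x0=np.zeros_like(grid))
print(np.round(x_next[:5], 3))
```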

In NEON-based FFBO, optimization is typically performed via multi-start L-BFGS-B or hybrid local/global strategies, with each candidate $x$ evaluated across multiple stochastic samples of $z$ for reliable acquisition estimation (Guilhoto et al., 3 Apr 2024).
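
A compact sketch of such a multi-start strategy is given below, assuming a fixed batch of epistemic index samples (common random numbers keep the Monte Carlo objective smooth across optimizer calls); `acq_mc`, the bounds, and the dimensions are placeholders rather than the paper's setup.

```python
import numpy as np
from scipy.optimize import minimize

def acq_mc(x, z_batch):
    """Placeholder Monte Carlo acquisition averaged over epistemic indices z;
    in practice this would query the NEON (or FFGP) surrogate."""
    return np.mean([-np.sum((x - 0.25) ** 2) + 0.05 * float(z @ x) for z in z_batch])

rng = np.random.default_rng(0)
dim = 20                                       # length of the discretized input function
z_batch = rng.standard_normal((16, dim))       # fixed epistemic index samples

best_x, best_val = None, -np.inf
for _ in range(8):                             # multi-start local optimization
    x0 = rng.uniform(-1, 1, size=dim)
    res = minimize(lambda x: -acq_mc(x, z_batch), x0, method="L-BFGS-B",
                   bounds=[(-1.0, 1.0)] * dim)
    if -res.fun > best_val:
        best_x, best_val = res.x, -res.fun
print(best_val)
```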

5. Theoretical Analysis and Regret Bounds

FFBO admits sublinear cumulative regret under mild assumptions:
$$R_T = \sum_{t=1}^T \big[g_\phi(\bm{x}^\star) - g_\phi(\bm{x}_t)\big] \leq \sqrt{B_1 T u_T \gamma_T} + \frac{\pi^2}{6},$$
where $B_1$, $u_T$, and $\gamma_T$ are problem-dependent quantities, with $\gamma_T$ the information gain after $T$ queries (Huang et al., 16 Nov 2025). The FFGP posterior covariance operator is trace-class, which ensures its concentration, and finite-mode posterior approximations converge in $L^2$ as the number of eigenmodes $m \rightarrow \infty$.
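
If $u_T$ and $\gamma_T$ grow sub-linearly in $T$ (as is typical for common kernels), a standard consequence is that the average regret, and hence the best-so-far regret, vanishes:

$$\min_{t \le T}\big[g_\phi(\bm{x}^\star) - g_\phi(\bm{x}_t)\big] \le \frac{R_T}{T} \le \sqrt{\frac{B_1 u_T \gamma_T}{T}} + \frac{\pi^2}{6T} \longrightarrow 0,$$

so the best queried input approaches the optimum of the scalarized objective as the query budget grows.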

A regularity condition is that the function $f$ lies in the RKHS defined by the operator-valued kernel, with Gaussian noise scaling as $\tau^2 / \sigma^2 \asymp n^{-c}$.

A theorem establishes the equivalence between L-EI and EI acquisition functions for bounded objectives, demonstrating that L-EI can be made arbitrarily close to EI by appropriate choice of the leaky slope $\delta$ (Guilhoto et al., 3 Apr 2024).

6. Empirical Performance and Practical Applications

Extensive synthetic and real-world experiments establish FFBO's effectiveness:

  • Synthetic Benchmarks: FFBO achieves the lowest simple regret and fastest convergence versus alternative methods (FIBO, FOBO, MTBO, scalarized BO) on one-dimensional function input/output tasks, outperforming models based on functional principal components or fixed-dimensional parameterizations (Huang et al., 16 Nov 2025).
  • Operator Learning Tasks: NEON-based FFBO converges 2–5× faster than standard GP-BO or deep ensemble methods on diffusion-inverse and PDE problems, achieving similar or better final objectives (Guilhoto et al., 3 Apr 2024).
  • Engineering Design: In a 3D-printed aortic-valve case study, FFBO exhibits faster regret reduction and improved solutions compared to baselines.
  • Telecommunications and Optics: NEON-FFBO, with substantially fewer parameters, outperforms deep ensemble surrogates on optical interferometer alignment and cell-tower coverage tasks (Guilhoto et al., 3 Apr 2024).
  • Dynamic Pricing: FFBO for function compositions, using independent GPs per constituent, outperforms vanilla BO and multi-output GP methods in revenue management applications by leveraging decomposition and scalarization (Jain et al., 2023).

Parameter and Computational Considerations

| FFBO Variant | Model Structure | Acquisition Estimation | Parameter / Compute Efficiency |
|---|---|---|---|
| FFGP (Huang et al., 16 Nov 2025) | Block-operator GP | Scalarization + UCB/EI | High (truncated eigenbasis) |
| NEON (Guilhoto et al., 3 Apr 2024) | Neural operator + EpiNet | Monte Carlo over ensembles | 10–100× fewer parameters than DE |
| Comp. GP (Jain et al., 2023) | Independent GPs per component | Marginal → composite via $h$ | O(M n³) per iteration |

This table summarizes differences in model structure and resource requirements; "DE" denotes deep ensembles.

7. Strengths, Limitations, and Research Directions

FFBO's strengths include:

  • Direct infinite-dimensional surrogate modeling without ad hoc discretization or FPCA truncation.
  • Operator-valued kernels and neural operator frameworks enabling rich modeling of input–output dependencies in function spaces.
  • Principled scalarization strategies supporting theoretical acquisition guarantees.
  • Demonstrated empirical advantages in convergence rate and sample efficiency across domains.

Limitations include:

  • Computational cost scales with the eigenmode truncation level ($m$), the sample size ($n$), and the need for high-precision integration in $L^2$.
  • Selection of the operator $T_\mathcal{Y}$ and the weighting function $\phi$ often requires domain-expert input.
  • Extensions under exploration include nonseparable kernels, multi-objective FFBO, batch query schemes, and automated estimation of scalarization weights (Huang et al., 16 Nov 2025).

A plausible implication is that advances in scalable operator learning and efficient functional optimization could further enhance the applicability of FFBO to scientific and engineering problems involving complex, structured functional relationships.
