
Functional Gradient Ascent (FGA)

Updated 18 November 2025
  • Functional Gradient Ascent (FGA) is a method for optimizing over infinite-dimensional, function-valued inputs by computing Fréchet derivatives of acquisition functions derived from Gaussian process models.
  • It employs a scalarized upper confidence bound (UCB) acquisition function, obtained by integrating the functional output against a weight function, to navigate infinite-dimensional function spaces.
  • Empirical benchmarks demonstrate that FGA significantly reduces regret and improves sample efficiency compared to alternative optimization methods in scientific and engineering applications.

The functional gradient ascent algorithm (FGA) optimizes acquisition functions derived from a function-on-function Gaussian process (FFGP) model over infinite-dimensional, function-valued inputs within function-on-function Bayesian optimization (FFBO). This setting arises when both the input and output of the optimization are elements of function spaces, as frequently encountered in scientific and engineering domains requiring optimization of curves, shapes, or other high-dimensional function-valued objects (Huang et al., 16 Nov 2025).

1. Mathematical Formulation of Function-on-Function BO

In FFBO, the objective is to maximize a black-box map $f: \mathcal{X}^p \to \mathcal{Y}$, where $\mathcal{X} \subset L^2(\Omega_x)$ is an infinite-dimensional Hilbert space of square-integrable functions on $\Omega_x$ and $\mathcal{Y} = L^2(\Omega_y)$ is the output space (also a Hilbert space of functions on $\Omega_y$). The principal challenge lies in modeling posterior uncertainty and optimizing over $\mathcal{X}$.

The FFGP prior is defined by a mean $\mu \in \mathcal{Y}$ and a separable operator-valued kernel

$$K(\bm x, \bm x') = \sigma^2\, k_x(\bm x, \bm x')\, T_\mathcal{Y},$$

where $k_x$ is a positive-definite scalar-valued kernel (e.g., a Matérn kernel on the $L^2$ metric) and $T_\mathcal{Y}$ is a nonnegative self-adjoint operator on $\mathcal{Y}$ (typically an integral operator).

For observation pairs $(\bm x_i, y_i)$, the posterior mean and covariance of $f(\bm x)$ are given by

$$\begin{aligned} \hat f(\bm x) &= \mu + \bm K_n(\bm x)^\top (\mathbf{K}_n + \tau^2 I_\mathcal{Y})^{-1} (\bm Y_n - \mathbf{1}_n\mu), \\ \hat K(\bm x,\bm x) &= K(\bm x, \bm x) - \bm K_n(\bm x)^\top (\mathbf{K}_n + \tau^2 I_\mathcal{Y})^{-1} \bm K_n(\bm x), \end{aligned}$$

where $\mathbf{K}_n$ is an $n \times n$ block matrix of operator-valued kernels and $\tau^2$ is the functional output noise variance.
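
To make this update concrete, the following NumPy sketch discretizes the input and output functions on uniform grids, approximates $T_\mathcal{Y}$ by an output-grid kernel matrix $k_y(s,t)$, and assembles $\mathbf{K}_n$ as a Kronecker product of the scalar input Gram matrix with that matrix. The function names, the Matérn-5/2 choice for both kernels, the constant prior mean, and the Riemann-sum quadrature are illustrative assumptions, not the reference implementation of (Huang et al., 16 Nov 2025).

```python
import numpy as np

def l2_dist(xa, xb, dt):
    # L2 distance between two input functions sampled on a common grid with spacing dt
    return np.sqrt(np.sum((xa - xb) ** 2) * dt)

def matern52(r, psi):
    # Matern-5/2 kernel profile evaluated at distance r with lengthscale psi
    a = np.sqrt(5.0) * r / psi
    return (1.0 + a + a ** 2 / 3.0) * np.exp(-a)

def ffgp_posterior(X, Y, x_star, dt, ds, psi_x=1.0, psi_y=1.0,
                   sigma2=1.0, tau2=1e-3, mu=0.0):
    """Posterior mean and covariance of f(x_star) under a separable FFGP prior.

    X : (n, m_x) observed input functions on a grid with spacing dt
    Y : (n, m_y) observed output functions on a grid with spacing ds
    mu is treated as a constant prior mean function (assumption).
    """
    n, m_y = Y.shape
    t = np.arange(m_y) * ds
    # T_Y approximated by a kernel matrix k_y(s, t) on the output grid
    Ty = matern52(np.abs(t[:, None] - t[None, :]), psi_y)

    # scalar input kernel k_x evaluated on L2 distances between input functions
    kx = np.array([[matern52(l2_dist(X[i], X[j], dt), psi_x) for j in range(n)]
                   for i in range(n)])
    kx_star = np.array([matern52(l2_dist(X[i], x_star, dt), psi_x) for i in range(n)])

    # block operator K_n + tau^2 I_Y, discretized as an (n*m_y, n*m_y) matrix
    Kn = sigma2 * np.kron(kx, Ty) + tau2 * np.eye(n * m_y)
    resid = (Y - mu).reshape(-1)              # stacked residuals Y_n - 1_n mu
    alpha = np.linalg.solve(Kn, resid)

    # posterior mean: mu + K_n(x*)^T (K_n + tau^2 I)^{-1} (Y_n - 1_n mu)
    Kstar = sigma2 * np.kron(kx_star[:, None], Ty)     # (n*m_y, m_y)
    mean = mu + Kstar.T @ alpha
    # posterior covariance at x*: K(x*, x*) - K_n(x*)^T (K_n + tau^2 I)^{-1} K_n(x*)
    cov = sigma2 * Ty - Kstar.T @ np.linalg.solve(Kn, Kstar)
    return mean, cov
```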

2. Scalarized Acquisition and UCB Functional

A scalar acquisition function is defined through a functional $L_\phi$ mapping $f(\bm x)$ to a scalar,

$$L_\phi: f(\bm x) \mapsto \int_{\Omega_y} \phi(t)\, f(\bm x)(t)\, dt,$$

for a weight function $\phi \in L^2(\Omega_y)$. The scalarized mean and variance become

$$\begin{aligned} \hat\mu_g(\bm x) &= L_\phi \mu + \bm k_x^{(n)}(\bm x)^\top \big(\mathbf{K}_x^{(n)} + \tfrac{\tau^2}{c}I\big)^{-1} (L_\phi \bm Y_n - \mathbf{1}_n\mu^g), \\ \hat k_g(\bm x,\bm x) &= c\big[k_x(\bm x,\bm x) - \bm k_x^{(n)}(\bm x)^\top \big(\mathbf{K}_x^{(n)} + \tfrac{\tau^2}{c}I\big)^{-1} \bm k_x^{(n)}(\bm x)\big], \end{aligned}$$

where $c = \iint \phi(t)\phi(s)\, k_y(s,t)\, ds\, dt$ and $k_y$ is the kernel underlying $T_\mathcal{Y}$.

The upper confidence bound (UCB) acquisition function is then

$$\alpha_{\mathrm{UCB}}(\bm x) = \hat\mu_g(\bm x) + \sqrt{\beta_t}\, \sqrt{\hat k_g(\bm x,\bm x)},$$

with exploration parameter $\beta_t$.
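
The scalarization admits the same kind of finite-grid sketch. Here $L_\phi$ and the constant $c$ are approximated by Riemann sums, the prior mean is again assumed constant, and `kx_fn` is a hypothetical callable returning the scalar input kernel $k_x$ between two discretized input functions; none of these names come from the source.

```python
import numpy as np

def scalarized_ucb(X, Y, x_star, phi, ds, ky, kx_fn, tau2=1e-3, mu=0.0, beta_t=4.0):
    """Scalarized UCB acquisition alpha_UCB(x_star) for a weight function phi.

    X     : (n, m_x)   observed input functions
    Y     : (n, m_y)   observed output functions on a grid with spacing ds
    phi   : (m_y,)     weight function on the output grid
    ky    : (m_y, m_y) output kernel matrix k_y(s, t)
    kx_fn : callable (xa, xb) -> scalar input kernel value
    """
    n = X.shape[0]
    # c = double integral of phi(t) phi(s) k_y(s, t), via a Riemann sum
    c = phi @ ky @ phi * ds * ds

    # L_phi applied to the observed outputs and to the (constant) prior mean
    L_Y = Y @ phi * ds                       # (n,)
    mu_g = mu * np.sum(phi) * ds

    # scalar input Gram matrix and cross-covariances to the candidate x_star
    Kx = np.array([[kx_fn(X[i], X[j]) for j in range(n)] for i in range(n)])
    kx_star = np.array([kx_fn(X[i], x_star) for i in range(n)])

    A = Kx + (tau2 / c) * np.eye(n)
    w = np.linalg.solve(A, L_Y - mu_g)

    mean_g = mu_g + kx_star @ w                                        # scalarized mean
    var_g = c * (kx_fn(x_star, x_star) - kx_star @ np.linalg.solve(A, kx_star))
    return mean_g + np.sqrt(beta_t) * np.sqrt(max(var_g, 0.0))
```

In an FFBO loop, this scalar acquisition is what the functional gradient ascent of the next section maximizes over the candidate input.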

3. Functional Gradient Ascent Algorithm

FGA is applied to maximize $\alpha_{\mathrm{UCB}}(\bm x)$ over the infinite-dimensional input space,

$$\bm x_{n+1} = \arg\max_{\bm x \in \mathcal{X}^p} \alpha_{\mathrm{UCB}}(\bm x).$$

The Fréchet derivative (the functional generalization of the gradient) is computed as

$$\nabla_{\bm x} \alpha_{\mathrm{UCB}} = g_1(\bm x) + g_2(\bm x),$$

where

$$\begin{aligned} g_1(\bm x) &= \nabla \bm k_x^{(n)}(\bm x)^\top \big(\mathbf{K}_x^{(n)} + \tfrac{\tau^2}{c}I\big)^{-1} (L_\phi\bm Y_n - \mathbf{1}_n\mu^g), \\ g_2(\bm x) &= -2c\, \nabla \bm k_x^{(n)}(\bm x)^\top \big(\mathbf{K}_x^{(n)} + \tfrac{\tau^2}{c}I\big)^{-1} \bm k_x^{(n)}(\bm x). \end{aligned}$$

For the $L^2$-distance based kernel (e.g., Matérn), the Fréchet derivative, identified with an element of $L^2$, is

$$\nabla_{\bm x} k_x(\bm x_i, \bm x)(s) = 2\, k_x\!\left(\frac{\|\bm x_i - \bm x\|}{\psi_x}\right) \frac{x(s) - x_i(s)}{\psi_x\, \|\bm x_i - \bm x\|}.$$

The optimization proceeds by iterative ascent,

$$\bm x^{(\ell)} = \bm x^{(\ell-1)} + \gamma_\ell\, \nabla_{\bm x} \alpha_{\mathrm{UCB}}(\bm x^{(\ell-1)}),$$

with step size $\gamma_\ell$, iterated until convergence.
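
A minimal sketch of the ascent loop on a discretized input grid is given below. For a closed-form Fréchet derivative it swaps in a squared-exponential kernel on the $L^2$ distance (the paper uses a Matérn kernel), treats $k_x(\bm x, \bm x)$ as constant, and differentiates $\sqrt{\hat k_g}$ by the chain rule; the fixed step size and all function names are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def fga_maximize_ucb(X, L_Y, x0, dt, psi_x=1.0, tau2=1e-3, c=1.0,
                     mu_g=0.0, beta_t=4.0, gamma=0.1, n_steps=200, tol=1e-6):
    """Functional gradient ascent on the scalarized UCB over a discretized input grid.

    X   : (n, m_x) observed input functions on a grid with spacing dt
    L_Y : (n,)     scalarized observations L_phi y_i
    x0  : (m_x,)   initial input function
    c   : scalarization constant from Section 2
    """
    def kx(xa, xb):
        # squared-exponential kernel on the L2 distance (assumption, for a simple gradient)
        d2 = np.sum((xa - xb) ** 2) * dt
        return np.exp(-d2 / (2.0 * psi_x ** 2))

    def grad_kx(xi, x):
        # Frechet derivative of k_x(x_i, .) at x, returned as a function on the grid
        return -kx(xi, x) * (x - xi) / psi_x ** 2

    n = X.shape[0]
    K = np.array([[kx(X[i], X[j]) for j in range(n)] for i in range(n)])
    A = K + (tau2 / c) * np.eye(n)
    w = np.linalg.solve(A, L_Y - mu_g)          # weights defining the scalarized mean

    x = x0.copy()
    for _ in range(n_steps):
        k_vec = np.array([kx(X[i], x) for i in range(n)])
        grads = np.array([grad_kx(X[i], x) for i in range(n)])   # (n, m_x)
        Ainv_k = np.linalg.solve(A, k_vec)

        var_g = c * (1.0 - k_vec @ Ainv_k)                       # k_hat_g(x, x)
        grad_mean = grads.T @ w                                  # g_1(x)
        grad_var = -2.0 * c * grads.T @ Ainv_k                   # gradient of k_hat_g
        grad_std = grad_var / (2.0 * np.sqrt(max(var_g, 1e-12))) # chain rule for the sqrt
        step = grad_mean + np.sqrt(beta_t) * grad_std            # Frechet gradient of alpha_UCB

        x = x + gamma * step
        if np.sqrt(np.sum(step ** 2) * dt) < tol:                # L2 norm of the update
            break
    return x
```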

4. Theoretical Properties and Convergence

Under regularity conditions (Matérn kernel, trace-class operator $T_\mathcal{Y}$, bounded $L^2$ norms, and appropriate vanishing noise scaling), the posterior for $f(\bm x)$ remains well-defined, and truncation at finite rank $m$ yields

$$\|\hat f_m(\bm x) - \hat f(\bm x)\| \le C m^{-1},$$

regardless of the choice of basis. The FFBO regret with FGA satisfies, with high probability,

$$R_T \le \sqrt{B_1\, T\, \beta_T\, \gamma_T} + \frac{\pi^2}{6},$$

with information gain $\gamma_T$; thus simple regret is $O^*(\sqrt{T})$. This provides a nontrivial guarantee for global optimization in infinite-dimensional settings (Huang et al., 16 Nov 2025).
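
For intuition, and assuming the standard convention that $R_T = \sum_{t=1}^{T} r_t$ with nonnegative instantaneous regrets $r_t$, the cumulative bound above controls the simple regret through the average,

$$\min_{1 \le t \le T} r_t \;\le\; \frac{R_T}{T} \;\le\; \sqrt{\frac{B_1\, \beta_T\, \gamma_T}{T}} + \frac{\pi^2}{6T},$$

which vanishes as $T \to \infty$ whenever $\beta_T \gamma_T$ grows sublinearly in $T$.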

5. Empirical Performance and Benchmarks

FGA-based FFBO is evaluated against three baselines: FIBO (function-input BO), FOBO (function-output BO), and MTBO (multi-task BO). On synthetic and real-world tasks involving optimization over function-valued domains (e.g., one-dimensional curves, stress–strain waveform matching), FFBO converges faster and attains lower regret for the same query budget across all tested scenarios, remains robust to observation noise, and is consistently more sample-efficient (Huang et al., 16 Nov 2025).

6. Context within Operator-Based Bayesian Optimization

Alternative approaches in the operator-learning and surrogate-modeling literature include surrogate construction via parametric operator networks (e.g., NEON, as in (Guilhoto et al., 3 Apr 2024)), which operate over deterministic mappings $h: X \to C(\mathcal{Y}, \mathbb{R}^{d_s})$ and use backpropagation-based optimizers (e.g., L-BFGS) in the reduced design space $X \subset \mathbb{R}^{d_u}$. These methods contrast with FGA in their reliance on finite-parametric representations and backpropagation instead of Fréchet-based optimization over function spaces.

7. Implications and Extensions

FGA represents a principled approach to functional optimization in the fully infinite-dimensional regime, circumventing the intrinsic limitations of finite parameterizations. A plausible implication is that FGA enables direct exploitation of smoothness and structural priors on function spaces, reflected in the kernel and operator choices, yielding both theoretical and empirical advantages in problems where function-valued inputs and outputs are intrinsic (Huang et al., 16 Nov 2025). Such methods lay the foundation for further advances in functional sequential design, shape optimization, and scientific computing with high-dimensional function spaces.
