Functional Bayesian Additive Regression Trees

Updated 28 December 2025
  • Functional Bayesian Additive Regression Trees (FBART) is a nonparametric model that integrates regression trees with spline-based basis expansions to capture complex functional responses.
  • It employs a Bayesian backfitting MCMC algorithm to update tree structures and leaf parameters, while incorporating shape constraints like monotonicity for enhanced interpretability.
  • FBART demonstrates strong empirical performance in applications such as battery capacity-fade analysis and spatial analytics, supported by theoretical posterior contraction guarantees.

Functional Bayesian Additive Regression Trees (FBART) is a class of fully nonparametric Bayesian models tailored for flexible function-on-scalar regression. FBART extends the Bayesian additive regression tree (BART) paradigm to directly model functional responses, combining the expressiveness of spline-based basis function expansion with the adaptability of regression trees for capturing nonlinear, heterogeneous covariate effects. A variant, shape-constrained FBART (S-FBART), further introduces priors that impose shape constraints, such as monotonicity or convexity, directly on the functional response, enhancing interpretability and estimation accuracy when prior shape information is available (Cao et al., 24 Feb 2025, Cao et al., 10 Mar 2025).

1. Model Construction and Mathematical Foundations

FBART models the relationship between scalar covariates $x \in \mathbb{R}^p$ and a functional response $Y_i(t)$, $t \in [0,1]$, observed for each sample $i$. The response is projected into a basis representation using B-splines of order $q$ with equally spaced knots, yielding $\boldsymbol{\varphi}(t) = (\varphi_1(t), \ldots, \varphi_J(t))^T$. Each subject $i$'s curve $Y_i(t)$ is approximated as $\boldsymbol{\varphi}(t)^T \beta_i$, where $\beta_i \in \mathbb{R}^J$ are subject-specific spline coefficients.
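The basis projection step can be sketched in a few lines of Python. This is a toy illustration under assumed values ($q = 4$, $J = 10$, a clamped equally spaced knot vector), not the paper's code: it builds $\boldsymbol{\varphi}(t)$ with `scipy` and recovers $\beta_i$ for one curve by least squares.

```python
import numpy as np
from scipy.interpolate import BSpline

# Toy setup: order-q B-splines with equally spaced (clamped) knots on [0, 1].
q = 4                      # spline order (cubic)
J = 10                     # number of basis functions
# Clamped knot vector of length J + q:
interior = np.linspace(0.0, 1.0, J - q + 2)[1:-1]
knots = np.concatenate([np.zeros(q), interior, np.ones(q)])

def phi(t):
    """Evaluate the J-vector of B-spline basis functions phi(t)."""
    t = np.atleast_1d(t)
    basis = [BSpline(knots, np.eye(J)[j], q - 1, extrapolate=False)(t)
             for j in range(J)]
    return np.column_stack(basis)

# Project one observed curve Y_i(t_ij) onto the basis by least squares:
t_obs = np.linspace(0.01, 0.99, 50)
Y_i = np.sin(2 * np.pi * t_obs)                  # toy curve
Phi = phi(t_obs)                                 # 50 x J design matrix
beta_i, *_ = np.linalg.lstsq(Phi, Y_i, rcond=None)
Y_hat = Phi @ beta_i                             # phi(t)^T beta_i
```

The columns of `Phi` sum to one at every interior point (partition of unity of a clamped B-spline basis), and the least-squares fit `Y_hat` approximates the smooth toy curve closely even with ten basis functions.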

The regression map from covariates to coefficient vectors, $x \mapsto \beta(x)$, is modeled as a sum over $K$ regression trees:

$$Y_i(t) = \sum_{k=1}^K g_k(x_i; T_k, M_k)(t) + \epsilon_i(t), \quad \epsilon_i(t) \sim N(0, \sigma^2).$$

Each tree $T_k$ partitions the covariate space into $L_k$ hyperrectangles $D_{k\ell}$, with each leaf $\ell$ associated with a coefficient vector $\mu_{k\ell} \in \mathbb{R}^J$. The tree contribution is

$$g_k(x; t) = \sum_{\ell=1}^{L_k} \left[\boldsymbol{\varphi}(t)^T \mu_{k\ell}\right] \cdot 1_{x \in D_{k\ell}},$$

and the overall predicted function at $x$ is $\Xi_{T,M}(t; x) = \sum_{k=1}^K \sum_{\ell=1}^{L_k} \boldsymbol{\varphi}(t)^T \mu_{k\ell} \cdot 1_{x \in D_{k\ell}}$ (Cao et al., 24 Feb 2025).
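The sum-of-trees prediction above amounts to routing $x$ to one leaf per tree, summing the leaf coefficient vectors, and multiplying by the basis matrix. A minimal sketch, with hypothetical leaf structures (each leaf stored as a box plus its $\mu_{k\ell}$ vector):

```python
import numpy as np

# Hypothetical representation: a tree is a list of leaves, each leaf a
# (lower, upper, mu) triple giving a hyperrectangle D_{kl} and its J-dim
# spline coefficient vector mu_{kl}.

def tree_predict_coeffs(tree, x):
    """Return the leaf coefficient vector mu_{kl} whose box contains x."""
    for lower, upper, mu in tree:
        if np.all(lower <= x) and np.all(x < upper):
            return mu
    raise ValueError("x falls outside every leaf box")

def fbart_mean(trees, x, Phi):
    """Xi(t; x) on a grid: sum over trees of phi(t)^T mu_{k, l(x)}.

    Phi is an (n_grid x J) matrix whose rows are phi(t)^T.
    """
    beta = sum(tree_predict_coeffs(tree, x) for tree in trees)  # beta(x)
    return Phi @ beta

# Toy example: p = 1 covariate, J = 2 coefficients, K = 2 trees.
tree1 = [(np.array([0.0]), np.array([0.5]), np.array([1.0, 0.0])),
         (np.array([0.5]), np.array([1.0]), np.array([0.0, 1.0]))]
tree2 = [(np.array([0.0]), np.array([1.0]), np.array([0.5, 0.5]))]
Phi = np.eye(2)                        # two grid points, identity basis
y = fbart_mean([tree1, tree2], np.array([0.25]), Phi)
# beta(x) = (1, 0) + (0.5, 0.5) = (1.5, 0.5)
```

With $x = 0.25$ the first tree routes to its left leaf and the second tree has a single leaf, so the summed coefficient vector is $(1.5, 0.5)$.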

In spatial settings (e.g., basketball shot charts), the functional response can be defined over a multidimensional domain $s \in B \subset \mathbb{R}^d$, with the mean surface modeled as $\Xi_x(s) = \mathbf{f}(s)^T \boldsymbol{\eta}(x)$, where $\mathbf{f} = (f_1, \ldots, f_J)^T$ is a (possibly adaptive) basis in $s$ and $\boldsymbol{\eta}(x)$ is a $J$-vector modeled by a sum-of-trees in $x$ (Cao et al., 10 Mar 2025).

2. Prior Specification and Shape Constraints

Priors are imposed independently over the tree structures $T_k$, their leaf-parameter sets $M_k = \{\mu_{k\ell}\}$, and the noise variance $\sigma^2$:

$$\pi(\{T_k, M_k\}_{k=1}^K, \sigma^2) = \pi(\sigma^2) \prod_{k=1}^K \pi(M_k \mid T_k)\, \pi(T_k).$$

Leaf parameters $\mu_{k\ell}$ follow $\mathcal{N}(\mu_0, V_0)$ (typically $\mu_0 = 0$, $V_0 = I_J/K$), and $\sigma^2 \sim (\nu\lambda)/\chi^2_\nu$. Tree nodes at depth $d$ split with probability $p_{\text{split}}(d) = a\gamma^d$, with the splitting variable and cut-point selected uniformly.
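The prior specification is straightforward to simulate from. A sketch with illustrative hyperparameter values (the particular $a$, $\gamma$, $\nu$, $\lambda$ below are assumptions, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(0)
J, K = 8, 50
a, gamma = 0.95, 0.5       # split-probability hyperparameters (illustrative)
nu, lam = 3.0, 1.0         # sigma^2 ~ nu * lam / chi^2_nu

# Leaf prior mu ~ N(0, I_J / K): variance 1/K shrinks each individual
# tree toward a small contribution, so the K-tree sum stays well scaled.
mu = rng.multivariate_normal(np.zeros(J), np.eye(J) / K)

# Scaled inverse chi-square draw for the noise variance.
sigma2 = nu * lam / rng.chisquare(nu)

def p_split(d):
    """Probability that a node at depth d splits: a * gamma^d."""
    return a * gamma ** d
```

Note how `p_split` decays geometrically in depth, so the prior favors shallow trees; the additive structure, not any single deep tree, carries the model's flexibility.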

Shape constraints are enforced by replacing the Gaussian prior on $\mu$ in each leaf with a truncated normal:

$$\mu \sim \mathcal{N}^D(\mu_0, V_0) \propto \exp\left[-\frac{1}{2}(\mu - \mu_0)^T V_0^{-1} (\mu - \mu_0)\right] 1_{D\mu \geq 0},$$

where $D$ encodes the linear constraint: $D = I_J$ for nonnegativity, a first-difference matrix for monotonicity, or a second-difference matrix (scaled by the knot spacing) for convexity. Because the constraint acts on the leaf coefficients of every tree, the functional constraint holds globally for all $x$ (Cao et al., 24 Feb 2025).
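The constraint matrices themselves are simple finite-difference operators on the coefficient vector. A minimal sketch of the construction described above (the function name and interface are hypothetical):

```python
import numpy as np

def constraint_matrix(J, kind, h=None):
    """Linear-constraint matrix D with D @ mu >= 0 enforcing a shape.

    kind: 'nonneg'   -> D = I_J            (nonnegative curve)
          'monotone' -> first differences  (nondecreasing curve)
          'convex'   -> second differences (convex curve; h = knot spacings)
    """
    if kind == "nonneg":
        return np.eye(J)
    if kind == "monotone":
        # row j encodes mu[j+1] - mu[j] >= 0
        return np.diff(np.eye(J), axis=0)
    if kind == "convex":
        # row j encodes the second difference of mu at knot j
        D = np.diff(np.eye(J), n=2, axis=0)
        if h is not None:
            D = D / (h[:, None] ** 2)
        return D
    raise ValueError(kind)

# Nondecreasing B-spline coefficients imply a nondecreasing curve:
D = constraint_matrix(5, "monotone")
mu = np.array([0.0, 0.2, 0.5, 0.9, 1.0])
# D @ mu gives the consecutive differences, all >= 0 for this mu
```

The key property exploited here is that, for B-splines, monotone (resp. convex) coefficient sequences yield monotone (resp. convex) curves, so a finite set of linear inequalities on $\mu$ suffices for a functional shape constraint.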

3. Posterior Inference and Bayesian Backfitting

FBART employs a custom Bayesian backfitting Markov chain Monte Carlo (MCMC) algorithm cycling sequentially over trees:

  1. Partial residuals: For tree $k$, compute residuals $r_i(t_{ij}) = Y_i(t_{ij}) - \sum_{k' \neq k} g_{k'}(x_i; T_{k'}, M_{k'})(t_{ij})$.
  2. Tree update: Propose a new topology $T_k'$ via grow/prune/change/swap moves with Metropolis–Hastings acceptance. The marginal likelihood $p(r \mid T_k, \cdot)$ is available in closed form by integrating out $M_k$.
  3. Leaf parameter update: Update each $\mu_{k\ell}$ from its full conditional Gaussian; for S-FBART, sample from the truncated-normal conditional using efficient algorithms (e.g., minimax tilting).
  4. Variance update: Update $\sigma^2$ from its inverse-gamma full conditional.
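The conjugacy behind step 3 can be made concrete. A sketch of the Gaussian leaf update for unconstrained FBART, under assumed notation (prior mean $\mu_0 = 0$, prior precision $V_0^{-1}$); the truncated S-FBART version would replace the final draw with a truncated-normal sampler:

```python
import numpy as np

def leaf_full_conditional(Phi_leaf, r_leaf, sigma2, V0_inv, rng):
    """Sample mu | r, sigma^2 ~ N(m, S) for one leaf (prior mean zero).

    Phi_leaf: stacked basis rows phi(t_ij)^T for observations in the leaf.
    r_leaf:   matching partial residuals for the current tree.
    V0_inv:   inverse of the leaf prior covariance (e.g. K * I_J).
    """
    # Standard Bayesian linear-model conjugacy:
    # S^{-1} = V0^{-1} + Phi^T Phi / sigma^2,  m = S Phi^T r / sigma^2.
    S_inv = V0_inv + Phi_leaf.T @ Phi_leaf / sigma2
    S = np.linalg.inv(S_inv)
    m = S @ (Phi_leaf.T @ r_leaf / sigma2)
    return rng.multivariate_normal(m, S)

# Toy check: with many observations and low noise, the draw should sit
# near the coefficient vector that generated the residuals.
rng = np.random.default_rng(1)
J, n = 4, 200
Phi_leaf = rng.normal(size=(n, J))
true_mu = np.array([1.0, -0.5, 0.3, 0.0])
r_leaf = Phi_leaf @ true_mu + 0.1 * rng.normal(size=n)
mu_draw = leaf_full_conditional(Phi_leaf, r_leaf, 0.01, 50 * np.eye(J), rng)
```

With informative data the likelihood precision dominates the prior term, and the posterior draw concentrates tightly around `true_mu`.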

In spatial or adaptive FBART (AFBART), the basis coefficients are themselves Bayesian objects, sampled from conjugate Gaussians subject to smoothness penalties, with orthonormality of the basis matrix $F$ enforced at each MCMC iteration (Cao et al., 10 Mar 2025).

4. Theoretical Guarantees

FBART's theoretical analysis establishes posterior contraction rates under general design and smoothness conditions. Assuming the true regression map $\Xi_0$ resides in a mixed Hölder class $HC^{\alpha,\beta}$, and setting the number of spline bases $J_n \sim N_n^{\beta/[\alpha(2\beta+p)+\beta]}$ and $p_{\text{split}}(d) = \gamma^{J_n \log J_n + d}$, the posterior concentrates at rate

$$\epsilon_n = N_n^{-\alpha\beta/[\alpha(2\beta+p)+\beta]} (\log N_n)^{1/2}$$

in empirical $L_2$ distance. For S-FBART, the same rate applies if $\Xi_0$ is $\kappa$-strictly shape-constrained (its relevant derivatives are bounded below) and the constrained B-spline approximation attains $O(J^{-\alpha})$ error (Cao et al., 24 Feb 2025).
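To make the rate concrete, one can plug in illustrative values. The numbers below ($\alpha = \beta = 2$, $p = 3$, $N_n = 10{,}000$) are assumptions chosen for the arithmetic, not values from the paper:

```python
import math

# Worked evaluation of the contraction-rate exponents.
alpha, beta, p = 2.0, 2.0, 3        # smoothness in t, smoothness in x, dim(x)
N = 10_000                          # sample size N_n

# Shared exponent: beta / [alpha(2*beta + p) + beta] = 2 / 16 = 0.125.
expo = beta / (alpha * (2 * beta + p) + beta)

# J_n ~ N^expo sets the order (up to constants) of the basis dimension.
J_n = N ** expo

# eps_n = N^{-alpha * expo} * sqrt(log N): here N^{-0.25} * sqrt(log 10000).
eps_n = N ** (-alpha * expo) * math.sqrt(math.log(N))
```

Here the polynomial part of the rate is $N_n^{-1/4}$; greater smoothness $\alpha$ or $\beta$, or lower covariate dimension $p$, pushes the exponent closer to the parametric $N_n^{-1/2}$.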

5. Empirical Validation and Real-World Applications

Simulations in (Cao et al., 24 Feb 2025) benchmark FBART and S-FBART against BART, monotone BART, Bayesian functional-on-scalar regression (FOSR), and local Fréchet regression under nonlinear, nonsmooth, or mixed-generating processes. FBART and S-FBART substantially outperform competitors in root mean squared prediction error (RMSPE), mean interval score (MIS), and mean continuous ranked probability score (MCRPS). S-FBART further improves accuracy under valid shape constraints, particularly in moderate-noise regimes.

Applications to battery capacity-fade (strictly monotone) and wage-experience (concave) data verify that S-FBART achieves the lowest predictive error and the best uncertainty-quantification metrics among the competing approaches.

In spatial function-on-scalar regression, FBART and AFBART have been applied to basketball shot selection analytics, modeling shot-intensity surfaces as functions of player covariates in two spatial dimensions. Adaptive FBART (with learned basis functions) achieves superior out-of-sample RMSPE and MCRPS and provides interpretable variable-importance profiles, outperforming classical and fixed-basis models in high-dimensional, nonstationary settings (Cao et al., 10 Mar 2025).

6. Extensions and Ongoing Research Directions

AFBART generalizes FBART by adaptively learning the basis functions for the functional domain, which enhances model fit and computational efficiency, especially when the functional response exhibits complex, multidimensional, or nonstationary features. The basis functions are regularized by thin-plate-spline penalties, and their identifiability is enforced by orthonormalization at each iteration (Cao et al., 10 Mar 2025).

Potential future directions include integrating functional covariates, extending to irregularly observed functional data, hierarchical models over multiple levels (e.g., longitudinal data structures), and further exploration of shape-constrained modeling frameworks for more general types of prior knowledge. A plausible implication is that AFBART architectures could be directly applicable to other scientific domains with high-dimensional, structured functional responses, such as genomics, environmental statistics, and market analytics.
