Functional Bayesian Additive Regression Trees

Updated 28 December 2025
  • Functional Bayesian Additive Regression Trees (FBART) is a nonparametric model that integrates regression trees with spline-based basis expansions to capture complex functional responses.
  • It employs a Bayesian backfitting MCMC algorithm to update tree structures and leaf parameters, while incorporating shape constraints like monotonicity for enhanced interpretability.
  • FBART demonstrates strong empirical performance in applications such as battery capacity-fade analysis and spatial analytics, supported by theoretical posterior contraction guarantees.

Functional Bayesian Additive Regression Trees (FBART) is a class of fully nonparametric Bayesian models tailored for flexible function-on-scalar regression. FBART extends the Bayesian additive regression tree (BART) paradigm to directly model functional responses, combining the expressiveness of spline-based basis function expansion with the adaptability of regression trees for capturing nonlinear, heterogeneous covariate effects. A variant, shape-constrained FBART (S-FBART), further introduces priors that impose shape constraints, such as monotonicity or convexity, directly on the functional response, enhancing interpretability and estimation accuracy when prior shape information is available (Cao et al., 24 Feb 2025, Cao et al., 10 Mar 2025).

1. Model Construction and Mathematical Foundations

FBART models the relationship between scalar covariates $x \in \mathbb{R}^p$ and a functional response $Y_i(t)$, $t \in [0,1]$, observed for each sample $i$. The response is projected into a basis representation using B-splines of order $q$ with equally spaced knots, yielding $\boldsymbol{\varphi}(t) = (\varphi_1(t), \ldots, \varphi_J(t))^T$. Each subject $i$'s curve $Y_i(t)$ is approximated as $\boldsymbol{\varphi}(t)^T \beta_i$, where $\beta_i \in \mathbb{R}^J$ are subject-specific spline coefficients.
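The basis projection step can be sketched in a few lines of Python. This is a toy illustration under assumed values ($q = 4$, $J = 10$, a clamped equally spaced knot vector), not the paper's code: it builds $\boldsymbol{\varphi}(t)$ with `scipy` and recovers $\beta_i$ for one curve by least squares.

```python
import numpy as np
from scipy.interpolate import BSpline

# Toy setup: order-q B-splines with equally spaced (clamped) knots on [0, 1].
q = 4                      # spline order (cubic)
J = 10                     # number of basis functions
# Clamped knot vector of length J + q:
interior = np.linspace(0.0, 1.0, J - q + 2)[1:-1]
knots = np.concatenate([np.zeros(q), interior, np.ones(q)])

def phi(t):
    """Evaluate the J-vector of B-spline basis functions phi(t)."""
    t = np.atleast_1d(t)
    basis = [BSpline(knots, np.eye(J)[j], q - 1, extrapolate=False)(t)
             for j in range(J)]
    return np.column_stack(basis)

# Project one observed curve Y_i(t_ij) onto the basis by least squares:
t_obs = np.linspace(0.01, 0.99, 50)
Y_i = np.sin(2 * np.pi * t_obs)                  # toy curve
Phi = phi(t_obs)                                 # 50 x J design matrix
beta_i, *_ = np.linalg.lstsq(Phi, Y_i, rcond=None)
Y_hat = Phi @ beta_i                             # phi(t)^T beta_i
```

The columns of `Phi` sum to one at every interior point (partition of unity of a clamped B-spline basis), and the least-squares fit `Y_hat` approximates the smooth toy curve closely even with ten basis functions.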

The regression map from covariates to coefficient vectors, $x \mapsto \beta(x)$, is modeled as a sum over $K$ regression trees:

$$Y_i(t) = \sum_{k=1}^K g_k(x_i; T_k, M_k)(t) + \epsilon_i(t), \quad \epsilon_i(t) \sim N(0, \sigma^2).$$

Each tree $T_k$ partitions the covariate space into $L_k$ hyperrectangles $D_{k\ell}$, with each leaf $\ell$ associated with a coefficient vector $\mu_{k\ell} \in \mathbb{R}^J$. The tree contribution is

$$g_k(x; t) = \sum_{\ell=1}^{L_k} \left[\boldsymbol{\varphi}(t)^T \mu_{k\ell}\right] \cdot 1_{x \in D_{k\ell}},$$

and the overall predicted function at $x$ is $\Xi_{T,M}(t; x) = \sum_{k=1}^K \sum_{\ell=1}^{L_k} \boldsymbol{\varphi}(t)^T \mu_{k\ell} \cdot 1_{x \in D_{k\ell}}$ (Cao et al., 24 Feb 2025).
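The sum-of-trees prediction above amounts to routing $x$ to one leaf per tree, summing the leaf coefficient vectors, and multiplying by the basis matrix. A minimal sketch, with hypothetical leaf structures (each leaf stored as a box plus its $\mu_{k\ell}$ vector):

```python
import numpy as np

# Hypothetical representation: a tree is a list of leaves, each leaf a
# (lower, upper, mu) triple giving a hyperrectangle D_{kl} and its J-dim
# spline coefficient vector mu_{kl}.

def tree_predict_coeffs(tree, x):
    """Return the leaf coefficient vector mu_{kl} whose box contains x."""
    for lower, upper, mu in tree:
        if np.all(lower <= x) and np.all(x < upper):
            return mu
    raise ValueError("x falls outside every leaf box")

def fbart_mean(trees, x, Phi):
    """Xi(t; x) on a grid: sum over trees of phi(t)^T mu_{k, l(x)}.

    Phi is an (n_grid x J) matrix whose rows are phi(t)^T.
    """
    beta = sum(tree_predict_coeffs(tree, x) for tree in trees)  # beta(x)
    return Phi @ beta

# Toy example: p = 1 covariate, J = 2 coefficients, K = 2 trees.
tree1 = [(np.array([0.0]), np.array([0.5]), np.array([1.0, 0.0])),
         (np.array([0.5]), np.array([1.0]), np.array([0.0, 1.0]))]
tree2 = [(np.array([0.0]), np.array([1.0]), np.array([0.5, 0.5]))]
Phi = np.eye(2)                        # two grid points, identity basis
y = fbart_mean([tree1, tree2], np.array([0.25]), Phi)
# beta(x) = (1, 0) + (0.5, 0.5) = (1.5, 0.5)
```

With $x = 0.25$ the first tree routes to its left leaf and the second tree has a single leaf, so the summed coefficient vector is $(1.5, 0.5)$.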

In spatial settings (e.g., basketball shot charts), the functional response can be defined over a multidimensional domain $s \in B \subset \mathbb{R}^d$, with the mean surface modeled as $\Xi_x(s) = \mathbf{f}(s)^T \boldsymbol{\eta}(x)$, where $\mathbf{f} = (f_1, \ldots, f_J)^T$ is a (possibly adaptive) basis in $s$ and $\boldsymbol{\eta}(x)$ is a $J$-vector modeled by a sum-of-trees in $x$ (Cao et al., 10 Mar 2025).

2. Prior Specification and Shape Constraints

Priors are imposed independently over the tree structures $T_k$, their leaf-parameter sets $M_k = \{\mu_{k\ell}\}$, and the noise variance $\sigma^2$:

$$\pi(\{T_k, M_k\}_{k=1}^K, \sigma^2) = \pi(\sigma^2) \prod_{k=1}^K \pi(M_k \mid T_k)\, \pi(T_k).$$

Leaf parameters $\mu_{k\ell}$ follow $\mathcal{N}(\mu_0, V_0)$ (typically $\mu_0 = 0$, $V_0 = I_J/K$), and $\sigma^2 \sim (\nu\lambda)/\chi^2_\nu$. Tree nodes at depth $d$ split with probability $p_{\text{split}}(d) = a\gamma^d$, with the splitting variable and cut-point selected uniformly.
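The prior specification is straightforward to simulate from. A sketch with illustrative hyperparameter values (the particular $a$, $\gamma$, $\nu$, $\lambda$ below are assumptions, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(0)
J, K = 8, 50
a, gamma = 0.95, 0.5       # split-probability hyperparameters (illustrative)
nu, lam = 3.0, 1.0         # sigma^2 ~ nu * lam / chi^2_nu

# Leaf prior mu ~ N(0, I_J / K): variance 1/K shrinks each individual
# tree toward a small contribution, so the K-tree sum stays well scaled.
mu = rng.multivariate_normal(np.zeros(J), np.eye(J) / K)

# Scaled inverse chi-square draw for the noise variance.
sigma2 = nu * lam / rng.chisquare(nu)

def p_split(d):
    """Probability that a node at depth d splits: a * gamma^d."""
    return a * gamma ** d
```

Note how `p_split` decays geometrically in depth, so the prior favors shallow trees; the additive structure, not any single deep tree, carries the model's flexibility.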

Shape constraints are enforced by replacing the Gaussian prior on $\mu$ in each leaf with a truncated normal:

$$\mu \sim \mathcal{N}^D(\mu_0, V_0) \propto \exp\left[-\frac{1}{2}(\mu - \mu_0)^T V_0^{-1} (\mu - \mu_0)\right] 1_{D\mu \geq 0},$$

where $D$ encodes the linear constraint: $D = I_J$ for nonnegativity, a first-difference matrix for monotonicity, or a second-difference matrix (scaled by the knot spacing) for convexity. Because the constraint acts on the leaf coefficients of every tree, the functional constraint holds globally for all $x$ (Cao et al., 24 Feb 2025).
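The constraint matrices themselves are simple finite-difference operators on the coefficient vector. A minimal sketch of the construction described above (the function name and interface are hypothetical):

```python
import numpy as np

def constraint_matrix(J, kind, h=None):
    """Linear-constraint matrix D with D @ mu >= 0 enforcing a shape.

    kind: 'nonneg'   -> D = I_J            (nonnegative curve)
          'monotone' -> first differences  (nondecreasing curve)
          'convex'   -> second differences (convex curve; h = knot spacings)
    """
    if kind == "nonneg":
        return np.eye(J)
    if kind == "monotone":
        # row j encodes mu[j+1] - mu[j] >= 0
        return np.diff(np.eye(J), axis=0)
    if kind == "convex":
        # row j encodes the second difference of mu at knot j
        D = np.diff(np.eye(J), n=2, axis=0)
        if h is not None:
            D = D / (h[:, None] ** 2)
        return D
    raise ValueError(kind)

# Nondecreasing B-spline coefficients imply a nondecreasing curve:
D = constraint_matrix(5, "monotone")
mu = np.array([0.0, 0.2, 0.5, 0.9, 1.0])
# D @ mu gives the consecutive differences, all >= 0 for this mu
```

The key property exploited here is that, for B-splines, monotone (resp. convex) coefficient sequences yield monotone (resp. convex) curves, so a finite set of linear inequalities on $\mu$ suffices for a functional shape constraint.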

3. Posterior Inference and Bayesian Backfitting

FBART employs a custom Bayesian backfitting Markov chain Monte Carlo (MCMC) algorithm cycling sequentially over trees:

  1. Partial residuals: For tree $k$, compute residuals $r_i(t_{ij}) = Y_i(t_{ij}) - \sum_{k' \neq k} g_{k'}(x_i; T_{k'}, M_{k'})(t_{ij})$.
  2. Tree update: Propose a new topology $T_k'$ via grow/prune/change/swap moves with Metropolis–Hastings acceptance. The marginal likelihood $p(r \mid T_k, \cdot)$ is available in closed form by integrating out $M_k$.
  3. Leaf parameter update: Update each $\mu_{k\ell}$ from its full conditional Gaussian; for S-FBART, sample from the truncated-normal conditional using efficient algorithms (e.g., minimax tilting).
  4. Variance update: Update $\sigma^2$ from its inverse-gamma full conditional.
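The conjugacy behind step 3 can be made concrete. A sketch of the Gaussian leaf update for unconstrained FBART, under assumed notation (prior mean $\mu_0 = 0$, prior precision $V_0^{-1}$); the truncated S-FBART version would replace the final draw with a truncated-normal sampler:

```python
import numpy as np

def leaf_full_conditional(Phi_leaf, r_leaf, sigma2, V0_inv, rng):
    """Sample mu | r, sigma^2 ~ N(m, S) for one leaf (prior mean zero).

    Phi_leaf: stacked basis rows phi(t_ij)^T for observations in the leaf.
    r_leaf:   matching partial residuals for the current tree.
    V0_inv:   inverse of the leaf prior covariance (e.g. K * I_J).
    """
    # Standard Bayesian linear-model conjugacy:
    # S^{-1} = V0^{-1} + Phi^T Phi / sigma^2,  m = S Phi^T r / sigma^2.
    S_inv = V0_inv + Phi_leaf.T @ Phi_leaf / sigma2
    S = np.linalg.inv(S_inv)
    m = S @ (Phi_leaf.T @ r_leaf / sigma2)
    return rng.multivariate_normal(m, S)

# Toy check: with many observations and low noise, the draw should sit
# near the coefficient vector that generated the residuals.
rng = np.random.default_rng(1)
J, n = 4, 200
Phi_leaf = rng.normal(size=(n, J))
true_mu = np.array([1.0, -0.5, 0.3, 0.0])
r_leaf = Phi_leaf @ true_mu + 0.1 * rng.normal(size=n)
mu_draw = leaf_full_conditional(Phi_leaf, r_leaf, 0.01, 50 * np.eye(J), rng)
```

With informative data the likelihood precision dominates the prior term, and the posterior draw concentrates tightly around `true_mu`.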

In spatial or adaptive FBART (AFBART), the basis coefficients are themselves Bayesian objects, sampled from conjugate Gaussians subject to smoothness penalties, with orthonormality of the basis matrix $F$ enforced at each MCMC iteration (Cao et al., 10 Mar 2025).

4. Theoretical Guarantees

FBART's theoretical analysis establishes posterior contraction rates under general design and smoothness conditions. Assuming the true regression map $\Xi_0$ resides in a mixed Hölder class $HC^{\alpha,\beta}$, and setting the number of spline bases $J_n \sim N_n^{\beta/[\alpha(2\beta+p)+\beta]}$ and $p_{\text{split}}(d) = \gamma^{J_n \log J_n + d}$, the posterior concentrates at rate

$$\epsilon_n = N_n^{-\alpha\beta/[\alpha(2\beta+p)+\beta]} (\log N_n)^{1/2}$$

in empirical $L_2$ distance. For S-FBART, the same rate applies if $\Xi_0$ is $\kappa$-strictly shape-constrained (its relevant derivatives are bounded below) and the constrained B-spline approximation attains $O(J^{-\alpha})$ error (Cao et al., 24 Feb 2025).
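To make the rate concrete, one can plug in illustrative values. The numbers below ($\alpha = \beta = 2$, $p = 3$, $N_n = 10{,}000$) are assumptions chosen for the arithmetic, not values from the paper:

```python
import math

# Worked evaluation of the contraction-rate exponents.
alpha, beta, p = 2.0, 2.0, 3        # smoothness in t, smoothness in x, dim(x)
N = 10_000                          # sample size N_n

# Shared exponent: beta / [alpha(2*beta + p) + beta] = 2 / 16 = 0.125.
expo = beta / (alpha * (2 * beta + p) + beta)

# J_n ~ N^expo sets the order (up to constants) of the basis dimension.
J_n = N ** expo

# eps_n = N^{-alpha * expo} * sqrt(log N): here N^{-0.25} * sqrt(log 10000).
eps_n = N ** (-alpha * expo) * math.sqrt(math.log(N))
```

Here the polynomial part of the rate is $N_n^{-1/4}$; greater smoothness $\alpha$ or $\beta$, or lower covariate dimension $p$, pushes the exponent closer to the parametric $N_n^{-1/2}$.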

5. Empirical Validation and Real-World Applications

Simulations in (Cao et al., 24 Feb 2025) benchmark FBART and S-FBART against BART, monotone BART, Bayesian functional-on-scalar regression (FOSR), and local Fréchet regression under nonlinear, nonsmooth, or mixed-generating processes. FBART and S-FBART substantially outperform competitors in root mean squared prediction error (RMSPE), mean interval score (MIS), and mean continuous ranked probability score (MCRPS). S-FBART further improves accuracy under valid shape constraints, particularly in moderate-noise regimes.

Applications to battery capacity-fade (strictly monotone) and wage-experience (concave) data verify that S-FBART achieves the lowest predictive error and the best uncertainty-quantification metrics among the competing approaches.

In spatial function-on-scalar regression, FBART and AFBART have been applied to basketball shot selection analytics, modeling shot-intensity surfaces as functions of player covariates in two spatial dimensions. Adaptive FBART (with learned basis functions) achieves superior out-of-sample RMSPE and MCRPS and provides interpretable variable-importance profiles, outperforming classical and fixed-basis models in high-dimensional, nonstationary settings (Cao et al., 10 Mar 2025).

6. Extensions and Ongoing Research Directions

AFBART generalizes FBART by adaptively learning the basis functions for the functional domain, which enhances model fit and computational efficiency, especially when the functional response exhibits complex, multidimensional, or nonstationary features. The basis functions are regularized by thin-plate-spline penalties, and their identifiability is enforced by orthonormalization at each iteration (Cao et al., 10 Mar 2025).

Potential future directions include integrating functional covariates, extending to irregularly observed functional data, hierarchical models over multiple levels (e.g., longitudinal data structures), and further exploration of shape-constrained modeling frameworks for more general types of prior knowledge. A plausible implication is that AFBART architectures could be directly applicable to other scientific domains with high-dimensional, structured functional responses, such as genomics, environmental statistics, and market analytics.
