Bayesian Interpolating Neural Network

Updated 6 February 2026
  • B-INN is a probabilistic surrogate modeling framework that extends 1D interpolating networks to multi-dimensional domains using tensor algebra and Bayesian inference.
  • The model employs a block-alternating least squares algorithm with closed-form Bayesian updates to achieve efficient, linear-scaling uncertainty estimation.
  • Empirical results demonstrate that B-INN delivers competitive accuracy and substantial speedups over traditional Gaussian processes and Bayesian neural networks in active learning settings.

The Bayesian Interpolating Neural Network (B-INN) is a probabilistic surrogate modeling framework designed to address scalability and reliability challenges in uncertainty quantification for large-scale, industry-driven simulations. The B-INN combines high-order interpolation, tensor decompositions, and alternating direction methods to achieve data efficiency and robust uncertainty estimation, enabling practical active learning in high-dimensional, data-intensive physical systems (Park et al., 30 Jan 2026).

1. Mathematical Formulation and Model Architecture

The Bayesian Interpolating Neural Network is founded on the extension of 1D Interpolating Neural Networks (INN) to multidimensional domains through tensor algebra. In the one-dimensional case, a scalar function $y(x)$ is expressed as a linear combination of $J$ fixed interpolation basis functions $\{\phi_j(x)\}_{j=1}^J$ with trainable weights:

$$y(x) = \sum_{j=1}^{J} \phi_j(x)\, w_j$$

where each $\phi_j$ is selected for numerical qualities such as compact support and smoothness.
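The 1D construction above is linear in the weights once a basis is fixed. As a minimal sketch, the following uses piecewise-linear "hat" functions, one concrete basis family with the compact support and smoothness the text describes (the paper's specific basis choice may differ):

```python
import numpy as np

def hat_basis(x, nodes):
    """Evaluate J piecewise-linear 'hat' basis functions at points x.

    Hat functions are one choice satisfying compact support and continuity;
    this is an illustrative assumption, not necessarily the paper's basis.
    Returns an array of shape (len(x), J).
    """
    x = np.atleast_1d(x)
    h = nodes[1] - nodes[0]  # uniform node spacing assumed for simplicity
    # phi_j(x) = max(0, 1 - |x - node_j| / h): 1 at its own node, 0 past neighbors
    return np.maximum(0.0, 1.0 - np.abs(x[:, None] - nodes[None, :]) / h)

# y(x) = sum_j phi_j(x) w_j  -- an interpolant that is linear in the weights
nodes = np.linspace(0.0, 1.0, 11)            # J = 11 basis functions
w = np.sin(2 * np.pi * nodes)                # weights chosen to match a target
y = hat_basis(np.array([0.25]), nodes) @ w   # interpolated value near sin(pi/2)
```

Because the model is linear in $w$, all Bayesian updates later in the article reduce to linear-Gaussian algebra.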

For inputs in $D$ dimensions, a naïve full-grid tensor product is computationally infeasible for large $D$. Instead, the B-INN uses a rank-$M$ CANDECOMP/PARAFAC (CP) tensor decomposition:

$$y(x_1,\ldots,x_D) = \sum_{m=1}^{M} \prod_{d=1}^{D} f_d^{(m)}(x_d), \qquad f_d^{(m)}(x_d) = \sum_{j=1}^{J_d} \phi_{d,j}(x_d)\, w_{d,j}^{(m)}.$$

This architecture can be interpreted as a shallow neural network with a single hidden layer of $M$ neurons, each neuron computing a product across the $D$ dimensions.
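The CP forward pass can be sketched in a few lines; the names and array shapes below are our own conventions for illustration:

```python
import numpy as np

def cp_forward(x, basis, weights):
    """Evaluate the rank-M CP interpolant at one D-dimensional input x.

    basis:   list of D callables; basis[d](x_d) -> (J_d,) vector phi_{d,j}(x_d)
    weights: list of D arrays; weights[d] has shape (M, J_d), entries w^{(m)}_{d,j}
    (Illustrative sketch of the formula in the text, not the paper's code.)
    """
    M = weights[0].shape[0]
    out = np.ones(M)                   # one running product per hidden neuron m
    for d in range(len(basis)):
        phi = basis[d](x[d])           # (J_d,)
        out *= weights[d] @ phi        # f_d^{(m)}(x_d) for all m at once
    return out.sum()                   # sum over the M "hidden neurons"
```

Storage is $\sum_d M J_d$ weights instead of the $\prod_d J_d$ a full grid would need, which is the point of the decomposition.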

The block-alternating scheme freezes the weights in $D-1$ dimensions so that each inference step targets a single dimension, yielding a design matrix $X_d \in \mathbb{R}^{N \times (M J_d)}$ and allowing efficient iterative optimization.

2. Bayesian Inference and Alternating Least Squares

Transitioning from the interpolating neural network to its Bayesian instantiation, independent spherical Gaussian priors are imposed on all weights in each block: $w_{d,j}^{(m)} \sim \mathcal{N}(0, \sigma_w^2)$.

Given the data, Bayesian linear regression is performed dimension-wise:

$$\mathbf{u} = X_d w_d + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma_n^2 I).$$

The posterior for $w_d$ is available in closed form, $w_d \mid \mathcal{D} \sim \mathcal{N}(\mu_d^{\mathrm{post}}, \Sigma_d^{\mathrm{post}})$, where

Σdpost=(σn2XdTXd+σw2I)1,μdpost=Σdpost(σn2XdTu)\Sigma_d^{\mathrm{post}} = \left( \sigma_n^{-2} X_d^T X_d + \sigma_w^{-2}I \right)^{-1}, \qquad \mu_d^{\mathrm{post}} = \Sigma_d^{\mathrm{post}} (\sigma_n^{-2} X_d^T \mathbf{u})

The block-wise alternating update cycles through the dimensions, each time rebuilding $X_d$ from the updated posterior means of the other dimensions. The procedure is run for a fixed number of Alternating Least Squares (ALS) sweeps.

3. Connection to Gaussian Processes

The B-INN’s function space is contained within that of Gaussian processes (GPs). In the limit $M \to \infty$, with the normalization $\bar{y}^{(M)} = y^{(M)}/\sqrt{M}$, the prior over functions converges (via the multivariate Central Limit Theorem) to a Gaussian process with kernel:

$$K(x, x') = \prod_{d=1}^{D} \left[ \sigma_w^2 \sum_{j=1}^{J_d} \phi_{d,j}(x_d)\, \phi_{d,j}(x'_d) \right].$$

With Gaussian additive noise, the finite-dimensional posteriors of the B-INN converge to those of a GP with kernel $K$ as $M \to \infty$. Thus, B-INN provides a tractable, low-rank surrogate whose function class approaches that of GPs in the infinite-rank regime.
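The limiting kernel is a product of per-dimension inner products of basis evaluations. A sketch, assuming the same user-supplied basis callables as before:

```python
import numpy as np

def binn_kernel(x, xp, basis, sigma_w=1.0):
    """Limiting GP kernel K(x, x') = prod_d [ sigma_w^2 sum_j phi_dj(x_d) phi_dj(x'_d) ].

    basis: list of D callables returning each dimension's basis vector.
    (Sketch of the document's formula; the basis family is an assumption.)
    """
    k = 1.0
    for d, phi in enumerate(basis):
        k *= sigma_w**2 * float(phi(x[d]) @ phi(xp[d]))  # one factor per dimension
    return k
```

Because each factor is a finite-dimensional inner product, $K$ is a degenerate (finite-rank) kernel, consistent with the B-INN function class being contained in, rather than equal to, that of general GPs.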

4. Complexity, Predictive Formulas, and Algorithm

After block-wise regression, predictions at a test input $x^+ = (x_1^+, \ldots, x_D^+)$ use the closed-form mean:

$$\mathbb{E}[y(x^+)] = \sum_{m=1}^{M} \prod_{d=1}^{D} \left[ \Phi_d(x_d^+)^\top \mu_d^{(m)} \right]$$

The variance (separating epistemic and aleatoric contributions) is computed mode-wise:

$$\mathrm{Var}[y(x^+)] = \sum_{m=1}^{M} \Bigl\{ \prod_{d=1}^{D} \left[ (\Phi_d^+)^\top \Sigma_d^{(m)} \Phi_d^+ \right] + \dots \Bigr\} - \bigl(\mathbb{E}[y(x^+)]\bigr)^2$$

with $\Phi_d(x_d^+)$ denoting the vector of interpolation basis evaluations for dimension $d$.

The computational complexity per ALS sweep is $\mathcal{O}(TDN)$, with $T$ iterations and $D$ dimensions, crucially linear in the sample size $N$ given that $M J_d \ll N$. For comparison, standard GPs require $\mathcal{O}(N^3)$ operations.

Training Algorithm:

  1. Initialize all $\mu_d^{(m)}$ to zero.
  2. For each ALS iteration $t = 1, \dots, T$ and each dimension $d$:
    • Compute the frozen factors $g_d^{(i)}$ for all samples $i$.
    • Build the design matrix $X_d$.
    • Solve the Bayesian linear regression (BLR): obtain $\Sigma_d^{\mathrm{post}}$ and $\mu_d^{\mathrm{post}}$.
    • Update the weights: $\mu_d \leftarrow \mu_d^{\mathrm{post}}$.
  3. Return $\{\mu_d, \Sigma_d\}$ for all $d$. The predictive mean and variance then follow from the closed-form formulas above.
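The training loop above can be sketched end to end. Note one deliberate deviation: the text initializes the means to zero, but with a strict zero start every frozen product $g_d$ vanishes, so this sketch uses a small random initialization instead (an assumption on our part). Shapes and names are our own conventions:

```python
import numpy as np

def als_train(B, u, M=2, T=10, sigma_n=0.1, sigma_w=1.0, seed=0):
    """Block-alternating Bayesian training of a rank-M B-INN (illustrative sketch).

    B: list of D precomputed basis matrices, B[d] of shape (N, J_d) with
       B[d][i, j] = phi_{d,j}(x_{i,d});  u: (N,) vector of targets.
    Returns per-dimension posterior means W[d] (M, J_d) and covariances.
    """
    rng = np.random.default_rng(seed)
    N, D = u.shape[0], len(B)
    # Random init (the paper states zero init; zero would freeze the products)
    W = [rng.normal(scale=0.5, size=(M, Bd.shape[1])) for Bd in B]
    Sigmas = [None] * D
    for _ in range(T):
        F = [Bd @ Wd.T for Bd, Wd in zip(B, W)]       # per-dim factors, (N, M)
        for d in range(D):
            g = np.ones((N, M))                       # frozen factor g^(m)_i
            for dp in range(D):
                if dp != d:
                    g *= F[dp]
            Jd = B[d].shape[1]
            # Design matrix X_d with columns indexed by (m, j)
            Xd = (g[:, :, None] * B[d][:, None, :]).reshape(N, M * Jd)
            A = Xd.T @ Xd / sigma_n**2 + np.eye(M * Jd) / sigma_w**2
            Sigma = np.linalg.inv(A)                  # closed-form BLR posterior
            mu = Sigma @ (Xd.T @ u) / sigma_n**2
            W[d] = mu.reshape(M, Jd)
            Sigmas[d] = Sigma
            F[d] = B[d] @ W[d].T                      # refresh this block's factor
    return W, Sigmas
```

The predictive mean at a test point then follows the closed formula: evaluate each dimension's factor and sum the mode-wise products.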

5. Empirical Performance and Benchmark Results

B-INN demonstrates superior computational and statistical performance in surrogate modeling tasks:

  • 1D Regression (Synthetic): For $\sin(3x) + 0.3\cos(9x)$ with noise, B-INN matches GP accuracy ($\mathrm{RMSE} \approx 10^{-2}$) using $J = 20$ basis functions at $20\times$ lower computation, while BNNs show higher error and significant sampling overhead. Training time for B-INN with $N = 10^6$ is approximately 1 s ($100\times$–$10^4\times$ faster than the fastest BNNs).
  • Aerodynamic Surrogate (BlendedNet 7→4):

| Model   | RMSE      | Training Time (s) |
|---------|-----------|-------------------|
| B-INN   | 0.02–0.05 | 20–30             |
| BNN-HMC | 0.02–0.05 | 575               |
| BNN-VI  | 0.05–0.08 | 550–610           |

B-INN achieves comparable or better accuracy at least $20\times$ faster.

  • Active Learning on PDEs:
    • Poisson (3D + 1 parametric): B-INN initial RMSE $2.6\times10^{-5}$, best RMSE after AL $2.1\times10^{-6}$ (training time 261 s), compared to BNN-HMC ($1.5\times10^{-3}$ / $3.8\times10^{-4}$, 8233 s) and BNN-VI ($1.4\times10^{-2}$ / $1.0\times10^{-3}$, 19321 s).
    • Heat Eq (2D + time + 2 param): B-INN initial RMSE $6.6\times10^{-4}$, best $1.5\times10^{-4}$, training time 7297 s; BNN-HMC $5.1\times10^{-2}$ / $1.2\times10^{-2}$, 32984 s; BNN-VI $2.9\times10^{-2}$ / $2.2\times10^{-3}$, 91960 s.

B-INN yields 10–50$\times$ speedups with lower error and more robust uncertainty estimates than BNN alternatives.

6. Large-Scale Active Learning and Applications

B-INN is engineered for workflows where training data is acquired at great cost (e.g., industrial simulations with high-fidelity physics engines). Its $\mathcal{O}(N)$ retraining complexity, closed-form uncertainty quantification, and tensorized basis enable practical active learning loops at scales unattainable by full GPs or standard BNNs.

B-INN’s calibrated epistemic variance quantifies model uncertainty, directly informing acquisition of new data points. This enables error reductions of approximately 90% within tens of active learning rounds, as opposed to hundreds required by less data-efficient architectures. Variational BNNs may systematically underestimate uncertainty, resulting in poor acquisition and slow error reduction. By contrast, B-INN maintains uncertainty calibration across scales, supporting robust design and simulation.
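A generic variance-driven acquisition round of this kind can be sketched as follows; `predict_mean_var` and `run_simulation` are hypothetical callables standing in for the B-INN's closed-form UQ and the expensive simulator, and the greedy max-variance rule is a common choice rather than the paper's confirmed protocol:

```python
import numpy as np

def active_learning_step(candidates, predict_mean_var, run_simulation, dataset):
    """One round of variance-driven acquisition (generic sketch).

    Query the candidate with the largest predictive variance, label it with
    the costly simulator, and append it for the next O(N) retrain.
    """
    means, variances = predict_mean_var(candidates)  # closed-form B-INN UQ
    pick = int(np.argmax(variances))                 # most uncertain candidate
    x_new = candidates[pick]
    y_new = run_simulation(x_new)                    # expensive high-fidelity call
    dataset.append((x_new, y_new))                   # retrain model on the update
    return dataset
```

The loop's cost is dominated by the simulator calls, which is why calibrated variance (avoiding wasted queries) matters more than raw training speed alone.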

7. Theoretical and Practical Significance

The B-INN occupies a hybrid point among classical interpolation, low-rank tensor methods, and modern Bayesian inference. Its function class, as $M \to \infty$, is a proper subset of that of GPs, allowing rigorous theoretical comparison. Linear scaling with sample size and tractable posterior updates distinguish it from other surrogate paradigms in both theory and practice.

B-INN constitutes a practical, well-calibrated foundation for large-scale, uncertainty-driven modeling, simulation, and design, particularly where rapid retraining and reliable uncertainty quantification are vital (Park et al., 30 Jan 2026). Its empirical speedups (20–10,000$\times$ over GPs/BNNs) and accuracy have direct implications in computational physics, engineering design, and scientific computing contexts where data/compute bottlenecks are prevalent.
