Data-Driven Acoustic Surrogate Models

Updated 13 March 2026

Data-driven acoustic surrogate models are computational constructs that approximate complex acoustic phenomena using statistical and physics-guided machine learning techniques.
They enable rapid evaluation and optimization in applications such as material design, room acoustics, and underwater sound propagation by replacing expensive numerical solvers.
Incorporating probabilistic frameworks and physics-based constraints, these surrogates deliver high accuracy, reduced computational cost, and enhanced interpretability across diverse acoustic domains.

Data-driven acoustic surrogate models are computational constructs that approximate complex input–output relationships governing acoustic phenomena using data and statistical or machine-learning techniques rather than direct numerical solutions of the governing physical equations. These surrogates enable rapid evaluation, calibration, optimization, and uncertainty quantification of acoustical systems—ranging from material design and wave propagation to room acoustics and vibroacoustic performance—where conventional high-fidelity models (e.g., finite element or boundary element solvers) may be prohibitively computationally expensive or data-intensive. The current landscape of data-driven acoustic surrogates encompasses polynomial metamodels, physics-guided regressors, probabilistic frameworks, physics-embedded neural architectures, and deep generative models, each tailored to different application domains, data regimes, and degrees of available physical knowledge.

1. Polynomial Metamodeling for Multiscale Acoustic Materials

Classical polynomial surrogates, as introduced in the context of acoustical material design, rely on global polynomial expansions in normalized parameter spaces to approximate the map from microstructural descriptors to effective acoustic properties. Given a vector of input parameters $m = (m_1, ..., m_n)$ describing, for example, porosity or pore shape, the surrogate map $\hat{q}(m)\approx q(m)$ is formulated as a sum over multivariate polynomials:

$u(\xi) \approx u_p(\xi) = \sum_{|\alpha| \leq p} c_\alpha P_\alpha(\xi)$

where $\xi \in [-1,1]^n$ is the parameter vector $m$ rescaled to the unit hypercube, $u(\xi)$ denotes the target quantity $q(m)$ expressed in the normalized variable (so that $u_p$ plays the role of $\hat{q}$), $P_\alpha$ are tensor-product Legendre polynomials, and $c_\alpha$ are coefficient vectors determined by projection via Gauss–Legendre quadrature. For low-dimensional parameter spaces ($n \lesssim 3$), a total degree $p$ in the range $10$–$15$ yields mean relative errors below $1\%$ across hundreds of independent test cases, with maximal observed errors $\lesssim 2\%$. The surrogate reduces online prediction times by orders of magnitude (e.g., from seconds or minutes to sub-millisecond per evaluation), which facilitates its deployment in design optimization loops for absorption, impedance, or other transport properties (Trinh et al., 2017).

Extensions to higher-dimensional spaces ($n>5$) employ sparse grids or $\ell_1$-minimization to mitigate the curse of dimensionality, and use multi-output regression for vector-valued targets. Convergence rates depend on the smoothness of the solution map: they are exponential in $p$ if the map is analytic and algebraic otherwise.
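The projection step can be illustrated with a minimal NumPy sketch for $n = 2$. It is not the implementation of Trinh et al.: the function `toy_q` is a hypothetical smooth stand-in for an expensive solver, and the helper names and the choice of $p + 2$ quadrature nodes per dimension are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import legendre as L

def legendre_projection_2d(f, p, n_quad=None):
    """Project f(xi1, xi2) on [-1, 1]^2 onto tensor-product Legendre
    polynomials, keeping only terms of total degree <= p."""
    n_quad = n_quad or p + 2                 # assumed quadrature order
    x, w = L.leggauss(n_quad)                # Gauss-Legendre nodes and weights
    V = L.legvander(x, p)                    # V[i, a] = P_a(x_i)
    X1, X2 = np.meshgrid(x, x, indexing="ij")
    F = f(X1, X2)                            # samples of the "expensive" model
    W = np.outer(w, w)
    # Orthogonality: integral of P_a^2 over [-1, 1] is 2 / (2a + 1)
    norm = (2 * np.arange(p + 1) + 1) / 2
    C = np.einsum("ij,ia,jb->ab", W * F, V, V) * np.outer(norm, norm)
    a, b = np.meshgrid(np.arange(p + 1), np.arange(p + 1), indexing="ij")
    C[a + b > p] = 0.0                       # total-degree truncation
    return C

def evaluate_surrogate(C, xi1, xi2):
    """Evaluate u_p(xi) = sum_{a,b} C[a, b] P_a(xi1) P_b(xi2)."""
    return L.legval2d(xi1, xi2, C)

def toy_q(xi1, xi2):
    """Hypothetical analytic response standing in for a numerical solver."""
    return 1.0 / (3.0 + xi1 + 0.5 * xi2)

C = legendre_projection_2d(toy_q, p=12)
```

Once `C` is computed offline, each call to `evaluate_surrogate` is a cheap polynomial evaluation, which is where the online speed-up comes from. The tensor quadrature grid used here grows exponentially with $n$, which is why higher-dimensional cases call for sparse grids or $\ell_1$-minimization.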

2. Machine Learning Surrogates for Vibroacoustics and Room Acoustics

For broader classes of acoustic systems, including sound transmission loss (STL) in vibroacoustics or building room acoustics, machine learning-based surrogates model the mapping from engineered or physical features to frequency- or metric-dependent acoustic outputs.

Vibroacoustic Transmission Loss

In benchmarks of several machine learning regressors for STL prediction (fully connected neural networks, Gaussian process regressors, random forests, gradient boosting), the best-performing surrogates incorporate engineered features encoding the mass law, bending stiffness, and modal density:

Inputs: material density, Young’s modulus, thickness, damping ratio, geometric layout, and derived features.
Outputs: STL curves over frequency bands.
Training: Latin-hypercube sampling (LHS) of input space, MSE loss, regularization, cross-validation.
Physics-guided feature engineering lowers RMSE from $\sim 0.2$ dB to $0.12$ dB on analytic benchmarks; band-averaging further reduces errors for numerical models.
Neural-network surrogates yield RMSE < 3 dB across STL models except full FEM, where errors stabilize at 5 dB for $N \sim 2000$.
Random forests and gradient boosting offer rapid training with intrinsic feature-importance diagnostics and interpretability, at a slight cost in predictive accuracy (Cunha et al., 2022).
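The sampling and feature-engineering steps listed above can be sketched as follows. This is an illustrative sketch, not the pipeline of Cunha et al.: the parameter bounds, the Poisson ratio, the air properties, and the restriction to the normal-incidence mass law and the thin-plate coincidence frequency are all assumptions, and the helper names are hypothetical.

```python
import numpy as np

RHO0, C0 = 1.21, 343.0   # assumed air density (kg/m^3) and sound speed (m/s)
NU = 0.3                 # assumed Poisson ratio

def latin_hypercube(n_samples, bounds, rng):
    """Latin-hypercube sample: one point per stratum in every dimension."""
    bounds = np.asarray(bounds, dtype=float)
    d = bounds.shape[0]
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, d))) / n_samples
    for j in range(d):
        u[:, j] = rng.permutation(u[:, j])   # decouple strata across dimensions
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])

def mass_law_tl(density, thickness, freqs):
    """Normal-incidence mass law, TL = 10 log10(1 + (pi f m / (rho0 c0))^2), in dB."""
    m = density * thickness                  # surface mass, kg/m^2
    return 10 * np.log10(1 + (np.pi * freqs[None, :] * m[:, None] / (RHO0 * C0)) ** 2)

def coincidence_frequency(density, youngs, thickness):
    """Thin-plate coincidence frequency f_c = c0^2 / (2 pi) * sqrt(m / D), in Hz."""
    m = density * thickness
    D = youngs * thickness ** 3 / (12 * (1 - NU ** 2))   # bending stiffness
    return C0 ** 2 / (2 * np.pi) * np.sqrt(m / D)

def physics_features(X, freqs):
    """Append derived features to raw inputs X = [density, Young's modulus, thickness]."""
    rho, E, h = X[:, 0], X[:, 1], X[:, 2]
    fc = coincidence_frequency(rho, E, h)
    return np.column_stack([X, np.log10(fc), mass_law_tl(rho, h, freqs)])

rng = np.random.default_rng(0)
bounds = [(1000.0, 8000.0), (1e9, 2e11), (0.001, 0.02)]   # hypothetical ranges
X = latin_hypercube(50, bounds, rng)
freqs = np.array([125.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0])
features = physics_features(X, freqs)
```

A regressor trained on `features` instead of `X` only has to learn the deviation from the mass-law trend, which is one plausible reading of why such features reduce error.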

Room Acoustic Performance

For early-stage building design, surrogate DNNs predict core room-acoustic metrics ($T_{30}$, $EDT$, $C_{80}$, $D_{50}$, and $STI$) from geometric and material input vectors. The networks are shallow fully connected (FC) architectures ($5$–$9$ layers, $50$–$150$ units per layer, ReLU activations, dropout), trained on $2916$ simulation-based room configurations. Mean absolute percentage errors of $1$–$3\%$ on the test set and $2$–$12\%$ on the validation set indicate robust in-domain prediction. Categorical variables (e.g., shading, furniture) are one-hot encoded; continuous features are normalized (Abarghooie et al., 2021).
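The input encoding and the error metric can be sketched in a few lines of NumPy. The feature names, category labels, and value ranges below are hypothetical, and min-max scaling is an assumption, since the source states only that continuous features are normalized.

```python
import numpy as np

def one_hot(labels, categories):
    """Encode each label as a 0/1 indicator vector over the given categories."""
    idx = np.array([categories.index(v) for v in labels])
    return np.eye(len(categories))[idx]

def min_max(x, lo, hi):
    """Scale a continuous feature to [0, 1] (assumed normalization)."""
    return (np.asarray(x, dtype=float) - lo) / (hi - lo)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_pred - y_true) / y_true))

# Hypothetical room descriptions: volume in m^3 and a furniture category
volumes = [60.0, 120.0, 200.0]
furniture = ["none", "dense", "sparse"]
furniture_categories = ["none", "sparse", "dense"]

inputs = np.column_stack([
    min_max(volumes, lo=50.0, hi=250.0),
    one_hot(furniture, furniture_categories),
])
```

The resulting `inputs` matrix has one normalized column per continuous feature and one indicator column per category level, which is the form a fully connected network expects.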

3. Physics-Guided and Probabilistic Surrogate Modeling

Surrogate construction benefits from embedding physical models or constraints, especially in settings with structured uncertainties or scarce data.

Bayesian Surrogates for Noise and Transfer Functions

For vehicle NVH (noise, vibration, harshness), Bayesian generalized additive models (GAMs) estimate the sound pressure level inside cabins as a function of speed, frequency, and categorical design variables. The surrogate combines physics-motivated basis expansions (polynomial or Gaussian), hierarchical priors, and heteroscedastic error models. Bayesian inference via NUTS (PyMC3) yields predictive means and credible intervals; a parametric bootstrap augments validation. The cross-validated $R^2$ is approximately $0.90$, and evaluating the fitted surrogate takes milliseconds (Prakash et al., 2022).
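The idea of a basis expansion with a posterior over its weights can be shown with a deliberately simplified sketch. It swaps the hierarchical, heteroscedastic model and NUTS sampling of the source for conjugate Bayesian linear regression with a fixed noise precision, which has a closed-form Gaussian posterior. The synthetic cabin-noise data, the basis centers, and the hyperparameters `alpha` and `beta` are all assumptions.

```python
import numpy as np

def gaussian_basis(x, centers, width):
    """Design matrix: a bias column followed by Gaussian bumps at the centers."""
    x = np.asarray(x, dtype=float)[:, None]
    bumps = np.exp(-0.5 * ((x - centers[None, :]) / width) ** 2)
    return np.hstack([np.ones((x.shape[0], 1)), bumps])

def fit_posterior(Phi, y, alpha, beta):
    """Gaussian posterior over weights for prior N(0, I/alpha), noise precision beta."""
    A = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi
    S = np.linalg.inv(A)                 # posterior covariance
    m = beta * S @ Phi.T @ y             # posterior mean
    return m, S

def predict(Phi_new, m, S, beta):
    """Predictive mean and standard deviation at new inputs."""
    mean = Phi_new @ m
    var = 1.0 / beta + np.einsum("ij,jk,ik->i", Phi_new, S, Phi_new)
    return mean, np.sqrt(var)

# Synthetic stand-in data: sound pressure level (dB) versus speed (km/h)
rng = np.random.default_rng(0)
speed = np.linspace(30.0, 130.0, 200)
true_spl = 55.0 + 0.1 * speed + 2.0 * np.sin(speed / 20.0)
spl = true_spl + 0.5 * rng.standard_normal(speed.size)

centers = np.linspace(30.0, 130.0, 8)
Phi = gaussian_basis(speed, centers, width=15.0)
m, S = fit_posterior(Phi, spl, alpha=1e-3, beta=1.0 / 0.5 ** 2)
mean, std = predict(Phi, m, S, beta=1.0 / 0.5 ** 2)
lower, upper = mean - 1.96 * std, mean + 1.96 * std   # approximate 95% band
```

Evaluation reduces to one matrix-vector product per query, consistent with millisecond-scale prediction. A full treatment along the lines of the source would place priors on the noise level and group-level effects and sample them with MCMC rather than fixing them.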

Physics-Aided Ray-based Surrogates

In data-constrained underwater acoustics, the Ray-Basis Neural Network (RBNN) architecture parameterizes the sound field as a sparse sum over exact high-frequency solutions (plane or spherical waves):

$\bar{p}(\mathbf{r}) = \sum_{m=1}^{n_{\text{ray}}} A_m e^{i\phi_m} e^{i \mathbf{k}_m \cdot \mathbf{r}}$

where amplitude, phase, and direction are learnable, and environmental knowledge (e.g., ray paths, reflection coefficients) can be injected via additional neural layers. By construction, the surrogate satisfies the governing PDE (Helmholtz equation) everywhere, admitting controlled extrapolation and interpretability. In benchmark cases, RBNNs achieve RMS errors $<2$ dB (order-of-magnitude improvements over GPR/DNN baselines), and recover physical parameters (e.g., seabed reflection laws, geoacoustic constants) to within a few percent (Li et al., 2022).
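The claim that a plane-wave sum satisfies the Helmholtz equation can be checked numerically. The sketch below is a 2D illustration under stated assumptions: a homogeneous medium, every wavevector constrained to $|\mathbf{k}_m| = k$, and fixed rather than learned ray parameters. It evaluates the field and a finite-difference estimate of $\nabla^2 \bar{p} + k^2 \bar{p}$; the function names are hypothetical and the training loop of the RBNN is not reproduced.

```python
import numpy as np

def ray_basis_field(r, amps, phases, angles, k):
    """Complex pressure sum_m A_m exp(i phi_m) exp(i k_m . r) at points r of shape (N, 2)."""
    kvec = k * np.stack([np.cos(angles), np.sin(angles)], axis=1)   # (M, 2), |k_m| = k
    return (amps * np.exp(1j * phases) * np.exp(1j * (r @ kvec.T))).sum(axis=1)

def helmholtz_residual(field_fn, r, k, h=1e-2):
    """Central-difference estimate of laplacian(p) + k^2 p at points r."""
    ex, ey = np.array([h, 0.0]), np.array([0.0, h])
    p0 = field_fn(r)
    lap = (field_fn(r + ex) + field_fn(r - ex)
           + field_fn(r + ey) + field_fn(r - ey) - 4 * p0) / h ** 2
    return lap + k ** 2 * p0

# Hypothetical three-ray field at 100 Hz in water (c = 1500 m/s assumed)
k = 2 * np.pi * 100.0 / 1500.0
amps = np.array([1.0, 0.5, 0.25])
phases = np.array([0.0, 0.7, -1.3])
angles = np.deg2rad([10.0, -25.0, 60.0])

def field(r):
    return ray_basis_field(r, amps, phases, angles, k)

rng = np.random.default_rng(1)
points = rng.uniform(-50.0, 50.0, size=(20, 2))
residual = helmholtz_residual(field, points, k)
```

The residual is limited only by finite-difference and rounding error, for any choice of amplitudes, phases, and angles. This is why fitting those parameters to sparse data cannot produce a field that violates the wave equation.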

Gaussian Process and Encoder-based Models for Environmental Acoustics

Hybrid frameworks for large-scale, real-time digital twins—e.g., 3D oceanic transmission loss—use an additive split:

$\mathrm{TL} = m_{\mathrm{phys}}(R, f) + r(\cdot)$

where $m_{\mathrm{phys}}$ encodes geometric spreading and frequency-dependent absorption, and $r(\cdot)$ is modeled by a sparse variational Gaussian process (SVGP) over latent features extracted from source/receiver/frequency and environmental encoders (e.g., Conv1D over bathymetry). The framework achieves sub-dB bias, RMSEs $\sim15$ dB, and $>800\times$ speedup versus Bellhop3D, with uncertainty quantification directly from the SVGP posterior (Deo et al., 30 Sep 2025).
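A minimal sketch of the additive split follows. It assumes spherical spreading ($20\log_{10} R$) and Thorp's empirical absorption formula for $m_{\mathrm{phys}}$; the source states only that the mean encodes geometric spreading and frequency-dependent absorption, so both choices are illustrative. The residual model is a hypothetical callable standing in for the SVGP.

```python
import numpy as np

def thorp_absorption_db_per_km(f_khz):
    """Thorp's empirical seawater absorption (dB/km), frequency in kHz."""
    f2 = np.asarray(f_khz, dtype=float) ** 2
    return 0.11 * f2 / (1 + f2) + 44 * f2 / (4100 + f2) + 2.75e-4 * f2 + 0.003

def m_phys(range_m, f_hz):
    """Physics mean: spherical spreading plus absorption, in dB."""
    range_m = np.asarray(range_m, dtype=float)
    spreading = 20 * np.log10(np.maximum(range_m, 1.0))
    absorption = thorp_absorption_db_per_km(np.asarray(f_hz) / 1e3) * range_m / 1e3
    return spreading + absorption

def residual_targets(tl_reference, range_m, f_hz):
    """Training targets for the learned term r(.): reference TL minus physics mean."""
    return np.asarray(tl_reference, dtype=float) - m_phys(range_m, f_hz)

def predict_tl(range_m, f_hz, residual_model):
    """TL = m_phys(R, f) + r(.), with residual_model a hypothetical stand-in for the SVGP."""
    return m_phys(range_m, f_hz) + residual_model(range_m, f_hz)

# Usage with made-up reference values and a trivial zero residual model
ranges = np.array([500.0, 1000.0, 5000.0])
freqs = np.array([1000.0, 1000.0, 1000.0])
tl_ref = np.array([58.0, 63.5, 79.0])            # hypothetical solver output, dB
targets = residual_targets(tl_ref, ranges, freqs)
baseline = predict_tl(ranges, freqs, lambda R, f: np.zeros_like(R))
```

Because the learned component only has to represent `targets`, it falls back to the physics mean far from the training data instead of to an arbitrary constant.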

4. Deep Generative Surrogates for Wave and Channel Fields

Recent advances leverage deep diffusion and generative models for large-scale and high-dimensional acoustic field prediction.

Image-based Surrogates for Helmholtz Solutions

Diffusion models equipped with ControlNet conditioning have been adapted to the Helmholtz equation on the HA30K materials dataset:

Inputs: binary mask images representing obstacle layouts; text encoding of materials and source properties.
Outputs: RGB images of the simulated pressure field (mapped from $\operatorname{Re} p$).
Only the ControlNet is trained atop a frozen Stable Diffusion U-Net, minimizing a denoising loss summed across noise levels.
Batching enables evaluation of multiple fields in parallel, achieving a $3\times$–$45\times$ speed-up over finite element solvers. Reported quality metrics at 50 diffusion steps are SSIM $\approx 0.67$, FID $\approx 41$, and MSE $\approx 0.12$. Physics constraints are not enforced at inference, and generalization is restricted to the training frequency and geometry grid (Gramaccioni et al., 7 Oct 2025).

Conditional Diffusion for Channel Dynamics

StableUASim, a latent diffusion surrogate for time-varying underwater acoustic channels, combines a three-layer Bi-LSTM autoencoder (128-dimensional latent) with a conditional diffusion model. The forward model is a Markov noising process; the reverse (generation) model is a neural denoiser conditioned on a measurement-derived latent. Pretraining on $10^6$ simulated time-varying impulse responses (TVIRs) enables rapid adaptation to new environments with fewer than 25 samples. Generated channels reproduce amplitude/phase statistics and BER curves that closely match real measurements, outperforming GAN and stochastic replay baselines. Once adapted, channel sampling takes $\sim 100$ ms per realization on GPU hardware (Li et al., 22 Nov 2025).
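The Markov noising process can be sketched in its standard DDPM closed form, $z_t = \sqrt{\bar\alpha_t}\, z_0 + \sqrt{1 - \bar\alpha_t}\, \varepsilon$. The linear variance schedule, the number of steps, and the function names are assumptions; the source does not specify StableUASim's schedule, and the conditional denoiser network is not reproduced here.

```python
import numpy as np

def linear_beta_schedule(n_steps, beta_start=1e-4, beta_end=2e-2):
    """Per-step noise variances (assumed linear schedule)."""
    return np.linspace(beta_start, beta_end, n_steps)

def q_sample(z0, t, alpha_bar, rng):
    """Draw z_t given z_0 in one shot; also return the noise that was added."""
    eps = rng.standard_normal(z0.shape)
    zt = np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return zt, eps

def denoising_loss(eps_pred, eps):
    """Mean-squared error between predicted and true noise (training objective)."""
    return np.mean((eps_pred - eps) ** 2)

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)          # cumulative signal retention

z0 = rng.standard_normal((4, 128))           # stand-in for 128-dim channel latents
zt, eps = q_sample(z0, t=10, alpha_bar=alpha_bar, rng=rng)
loss_if_predicting_zero = denoising_loss(np.zeros_like(eps), eps)
```

In training, a denoiser conditioned on the measurement-derived latent would replace the zero prediction above. Generation then runs the learned reverse process from pure noise back to a channel latent that the autoencoder decodes.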

5. Data Acquisition, Experimental Pipelines, and Compressed Feature Design

Surrogate fidelity strongly depends on the quality and diversity of training data:

Robotic experimental frameworks combine automated measurement and geometric parameterization for complex diffusive panels. Multi-scale geometry is controlled at macro, meso, and micro levels; normalized cumulative energy curves across hundreds of IRs per panel are extracted via bandpass wavelet frames and used as compressed surrogate targets. Planned modeling includes both dense neural regressors and tree-based architectures, with emphasis on dimensional reduction (PCA, SOM) and data augmentation strategies (Rust et al., 2021).
Inverse problems in acoustics employ surrogates for source localization: a fully-connected NN trained on synthetic PDE solutions enables rapid evaluation of Bayesian posteriors over source positions using MCMC, realizing speed-ups of $10^3$–$10^4$ over direct solvers. Surrogate model error is propagated through the likelihood; credible intervals reflect the combined measurement and surrogate uncertainty (Ersin et al., 2023).
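The surrogate-in-the-loop inversion can be sketched with a random-walk Metropolis sampler. Everything specific below is an assumption: `surrogate_forward` is a hypothetical $1/r$ amplitude model standing in for the trained network, the sensor layout and noise levels are made up, and surrogate error is folded into the likelihood by adding its variance to the measurement variance, which is one simple way to propagate it.

```python
import numpy as np

def surrogate_forward(src, sensors):
    """Hypothetical cheap forward model: 1/r amplitude at each sensor."""
    d = np.linalg.norm(sensors - src, axis=1)
    return 1.0 / np.maximum(d, 1e-3)

def log_posterior(src, data, sensors, sigma_meas, sigma_surr, lo, hi):
    """Uniform prior on [lo, hi]^2; Gaussian likelihood with inflated variance."""
    if np.any(src < lo) or np.any(src > hi):
        return -np.inf
    var = sigma_meas ** 2 + sigma_surr ** 2      # measurement + surrogate error
    resid = data - surrogate_forward(src, sensors)
    return -0.5 * np.sum(resid ** 2) / var

def metropolis(logp, x0, n_steps, step, rng):
    """Random-walk Metropolis sampler returning the full chain."""
    x = np.array(x0, dtype=float)
    lp = logp(x)
    chain = np.empty((n_steps, x.size))
    for i in range(n_steps):
        prop = x + step * rng.standard_normal(x.size)
        lp_prop = logp(prop)
        if np.log(rng.random()) < lp_prop - lp:  # accept or reject
            x, lp = prop, lp_prop
        chain[i] = x
    return chain

rng = np.random.default_rng(0)
sensors = np.array([[0, 0], [10, 0], [0, 10], [10, 10],
                    [5, 0], [5, 10], [0, 5], [10, 5]], dtype=float)
true_src = np.array([3.0, 6.0])
data = surrogate_forward(true_src, sensors) + 0.005 * rng.standard_normal(len(sensors))

def logp(s):
    return log_posterior(s, data, sensors, 0.005, 0.005, 0.0, 10.0)

chain = metropolis(logp, x0=[5.0, 5.0], n_steps=5000, step=0.2, rng=rng)
posterior = chain[1000:]                         # discard burn-in
estimate = posterior.mean(axis=0)
interval = np.percentile(posterior, [2.5, 97.5], axis=0)
```

Each MCMC step costs one surrogate evaluation instead of one PDE solve, which is the origin of the large speed-ups. The inflated variance widens `interval` so that it reflects both error sources.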

6. Limitations, Generalization, and Future Directions

Purely data-driven surrogates are limited by the volume and diversity of training data and often lack guaranteed physical consistency, especially when extrapolating beyond the measured or simulated parameter space. Physics-augmented surrogates (RBNN, physics-guided mean functions, operator-embedded NNs) mitigate this risk by leveraging known analytic forms or PDE constraints, yielding improved generalization and interpretability (Li et al., 2022, Deo et al., 30 Sep 2025).
In high-frequency, low-diffraction regimes, ray-based surrogates provide efficient and interpretable architectures; at lower frequencies or in highly reverberant/diffusive systems, alternative bases (e.g., normal modes) or nonparametric ML surrogates become necessary.
Generative diffusion-based surrogates enable scalable, parallelizable field and channel synthesis, but typically require large datasets for pretraining and may not enforce exact physical boundary or constitutive constraints unless explicitly regularized.
Bayesian surrogates and probabilistic ML frameworks enable uncertainty quantification and risk assessment, integrating well with design optimization and early-stage concept screening.
Future advances will likely combine operator learning, generative modeling, uncertainty-aware inference, and modular data acquisition/processing pipelines, targeting truly multi-modal, multi-scale acoustic system design and real-time digital twin deployment across architectural, material, marine, and communications acoustics (Trinh et al., 2017, Prakash et al., 2022, Li et al., 22 Nov 2025).

7. Representative Model Features and Domains

Approach	Target Domain	Physical Knowledge
Polynomial metamodels (Trinh et al., 2017)	Multiscale material design	Moderate (basis expansion)
ML regressors with physics features (Cunha et al., 2022)	Vibroacoustics (STL, transmission)	High (feature engineering)
Bayesian GAMs (Prakash et al., 2022)	Vehicle interior noise (NVH)	Moderate (basis/priors)
RBNN/RCNN (Li et al., 2022)	Underwater field, reverberation	Strong (governing solution)
Neural surrogate for PDE source inversion (Ersin et al., 2023)	Acoustic wave, inverse problem	Weak (learn solution map)
Stable Diffusion + ControlNet (Gramaccioni et al., 7 Oct 2025)	Helmholtz, acoustic fields	None (generative)
StableUASim latent diffusion (Li et al., 22 Nov 2025)	Time-varying channel modeling	Moderate (sim. pretraining)

A plausible implication is that the field is moving toward hybrid frameworks where physical insight is leveraged for sample efficiency and robust extrapolation, while generative and deep learning models enable scalable, end-to-end surrogate prediction and uncertainty-aware optimization.