Data-Driven Acoustic Surrogate Models
- Data-driven acoustic surrogate models are computational constructs that approximate complex acoustic phenomena using statistical and physics-guided machine learning techniques.
- They enable rapid evaluation and optimization in applications such as material design, room acoustics, and underwater sound propagation by replacing expensive numerical solvers.
- Incorporating probabilistic frameworks and physics-based constraints, these surrogates deliver high accuracy, reduced computational cost, and enhanced interpretability across diverse acoustic domains.
Data-driven acoustic surrogate models are computational constructs that approximate complex input–output relationships governing acoustic phenomena using data and statistical or machine-learning techniques rather than direct numerical solutions of the governing physical equations. These surrogates enable rapid evaluation, calibration, optimization, and uncertainty quantification of acoustical systems—ranging from material design and wave propagation to room acoustics and vibroacoustic performance—where conventional high-fidelity models (e.g., finite element or boundary element solvers) may be prohibitively computationally expensive or data-intensive. The current landscape of data-driven acoustic surrogates encompasses polynomial metamodels, physics-guided regressors, probabilistic frameworks, physics-embedded neural architectures, and deep generative models, each tailored to different application domains, data regimes, and degrees of available physical knowledge.
1. Polynomial Metamodeling for Multiscale Acoustic Materials
Classical polynomial surrogates, as introduced in the context of acoustical material design, rely on global polynomial expansions in normalized parameter spaces to approximate the map from microstructural descriptors to effective acoustic properties. Given a vector of input parameters describing, for example, porosity or pore shape, the surrogate map is formulated as a sum over multivariate polynomials:
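$$\hat{\mathbf{y}}(\boldsymbol{\xi}) = \sum_{|\boldsymbol{\alpha}| \le p} \mathbf{c}_{\boldsymbol{\alpha}}\, \Psi_{\boldsymbol{\alpha}}(\boldsymbol{\xi}),$$
where $\boldsymbol{\xi}$ is the normalized variable, $\Psi_{\boldsymbol{\alpha}}$ are tensor-product Legendre polynomials indexed by the multi-index $\boldsymbol{\alpha}$, $p$ is the total degree, and $\mathbf{c}_{\boldsymbol{\alpha}}$ are coefficient vectors determined by projection via Gauss–Legendre quadrature. For low-dimensional parameter spaces, a total degree in the range $10$–$15$ yields low mean relative errors and bounded maximal errors across hundreds of independent test cases. The surrogate model reduces online prediction times by orders of magnitude (e.g., from seconds or minutes to sub-millisecond per evaluation), which makes it practical inside design optimization loops for absorption, impedance, or other transport properties (Trinh et al., 2017).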
Extensions to higher-dimensional spaces employ sparse grids or $\ell_1$-minimization to mitigate the curse of dimensionality, and multi-output regression for vector-valued targets. Analytical convergence rates depend on the smoothness of the solution map: convergence is exponential if the map is analytic and algebraic otherwise.
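The projection step can be illustrated in one dimension. The following is a minimal sketch, assuming a single normalized parameter on $[-1, 1]$ and a made-up analytic response `toy_model` standing in for an expensive solver; the function names are illustrative and are not taken from Trinh et al. (2017). A multi-parameter surrogate would use tensor products of the same basis.

```python
import numpy as np
from numpy.polynomial import legendre


def fit_legendre_surrogate(model, degree, n_quad):
    """Project `model` onto Legendre polynomials P_0..P_degree on [-1, 1]."""
    nodes, weights = legendre.leggauss(n_quad)      # Gauss-Legendre rule
    basis = legendre.legvander(nodes, degree)       # basis[i, k] = P_k(nodes[i])
    norms = 2.0 / (2.0 * np.arange(degree + 1) + 1.0)  # integral of P_k^2
    # c_k = (1 / norm_k) * sum_i w_i * model(x_i) * P_k(x_i)
    return (basis * weights[:, None]).T @ model(nodes) / norms


def toy_model(xi):
    """Hypothetical smooth response; stands in for an expensive solver."""
    return np.exp(-xi) * np.sin(3.0 * xi)


coeffs = fit_legendre_surrogate(toy_model, degree=12, n_quad=32)

# Online stage: evaluating the surrogate is a cheap polynomial evaluation.
xi_test = np.linspace(-1.0, 1.0, 201)
max_err = np.max(np.abs(legendre.legval(xi_test, coeffs) - toy_model(xi_test)))
print(f"max abs error at degree 12: {max_err:.2e}")
```

Because the toy response is analytic, the error drops rapidly with the degree, which is the behaviour the convergence statement above describes.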
2. Machine Learning Surrogates for Vibroacoustics and Room Acoustics
For broader classes of acoustic systems, including sound transmission loss (STL) in vibroacoustics or building room acoustics, machine learning-based surrogates model the mapping from engineered or physical features to frequency- or metric-dependent acoustic outputs.
Vibroacoustic Transmission Loss
In benchmarks of multiple machine learning regressors (fully-connected neural networks, Gaussian process regressors, random forests, gradient boosting) for STL prediction, the best-performing surrogates incorporate engineered features encoding mass law, bending stiffness, and modal density:
- Inputs: material density, Young’s modulus, thickness, damping ratio, geometric layout, and derived features.
- Outputs: STL curves over frequency bands.
- Training: Latin-hypercube sampling (LHS) of input space, MSE loss, regularization, cross-validation.
- Physics-guided feature engineering lowers RMSE to $0.12$ dB on analytic benchmarks; band-averaging further reduces errors for numerical models.
- NN surrogates yield RMSE < 3 dB across STL models except full FEM, where errors stabilize at 5 dB.
- RF and GBT offer rapid training with intrinsic feature importance diagnostics and interpretability at slight cost to predictive accuracy (Cunha et al., 2022).
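The sampling and feature-engineering steps in the list above can be sketched as follows. This is an illustrative example only: the parameter bounds, the Poisson ratio, and the choice of features (surface mass, bending stiffness, critical frequency, and the textbook field-incidence mass-law estimate $20\log_{10}(m_s f) - 47$ dB) are assumptions made here, not the feature set of Cunha et al. (2022).

```python
import numpy as np


def latin_hypercube(n_samples, bounds, rng):
    """Basic LHS: one point per stratum in every dimension."""
    bounds = np.asarray(bounds, dtype=float)
    dim = len(bounds)
    # Independently permute the strata indices for each dimension.
    strata = np.stack([rng.permutation(n_samples) for _ in range(dim)], axis=1)
    unit = (strata + rng.random((n_samples, dim))) / n_samples
    return bounds[:, 0] + unit * (bounds[:, 1] - bounds[:, 0])


def physics_features(rho, young, thickness, freqs, nu=0.3, c_air=343.0):
    """Derived plate features; nu and c_air are assumed constants."""
    m_s = rho * thickness                                   # surface mass, kg/m^2
    stiffness = young * thickness**3 / (12.0 * (1.0 - nu**2))  # bending stiffness
    f_crit = c_air**2 / (2.0 * np.pi) * np.sqrt(m_s / stiffness)
    tl_mass = 20.0 * np.log10(m_s[:, None] * freqs[None, :]) - 47.0
    return m_s, stiffness, f_crit, tl_mass


rng = np.random.default_rng(0)
# Hypothetical design ranges: density (kg/m^3), Young's modulus (Pa), thickness (m).
bounds = [(500.0, 8000.0), (1e9, 2e11), (1e-3, 2e-2)]
rho, young, thickness = latin_hypercube(64, bounds, rng).T
freqs = np.array([125.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0])

m_s, stiffness, f_crit, tl_mass = physics_features(rho, young, thickness, freqs)
features = np.column_stack([m_s, np.log10(stiffness), np.log10(f_crit), tl_mass])
print(features.shape)
```

The resulting `features` matrix would be concatenated with the raw inputs and passed to whichever regressor is being benchmarked.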
Room Acoustic Performance
For early-stage building design, surrogate DNNs predict five core room-acoustic metrics from geometric and material input vectors. Networks are shallow FC architectures ($5$–$9$ layers, $50$–$150$ units per layer, ReLU activations, dropout), trained on $2916$ simulation-based room configurations. Reported mean absolute percentage errors start at $1\%$ on the test set and $2\%$ on the validation set, indicating robust in-domain prediction. Categorical variables (e.g., shading, furniture) are one-hot encoded; continuous features are normalized (Abarghooie et al., 2021).
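The input encoding described above can be sketched in a few lines. The feature names, category labels, and the choice of z-score normalization are assumptions for illustration; the source does not specify the exact scheme.

```python
import numpy as np


def one_hot(labels, categories):
    """Encode each label as a unit vector over the known categories."""
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(labels), len(categories)))
    encoded[np.arange(len(labels)), [index[l] for l in labels]] = 1.0
    return encoded


def standardize(x):
    """Z-score each column; constant columns are left unscaled."""
    mean, std = x.mean(axis=0), x.std(axis=0)
    return (x - mean) / np.where(std > 0, std, 1.0), mean, std


# Hypothetical rooms: length (m), width (m), height (m), mean absorption.
continuous = np.array([
    [6.0, 4.0, 3.0, 0.20],
    [8.0, 5.0, 3.0, 0.40],
    [10.0, 6.0, 3.5, 0.10],
    [5.0, 5.0, 2.8, 0.30],
])
furniture = ["none", "sparse", "dense", "sparse"]

scaled, mean, std = standardize(continuous)
X = np.hstack([scaled, one_hot(furniture, ["none", "sparse", "dense"])])
print(X.shape)
```

The stored `mean` and `std` would be reused to scale new designs at prediction time, so that inputs seen by the network stay on the training scale.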
3. Physics-Guided and Probabilistic Surrogate Modeling
Surrogate construction benefits from embedding physical models or constraints, especially in settings with structured uncertainties or scarce data.
Bayesian Surrogates for Noise and Transfer Functions
For vehicle NVH (noise, vibration, harshness), Bayesian generalized additive models (GAMs) estimate the sound pressure level inside cabins as a function of speed, frequency, and categorical design variables. The surrogate combines physics-motivated basis expansions (polynomial or Gaussian), hierarchical priors, and heteroscedastic error models. Bayesian inference via NUTS (PyMC3) yields predictive means and credible intervals; a parametric bootstrap augments validation. The model is also assessed by cross-validation, and analytic evaluation of the fitted surrogate requires only milliseconds (Prakash et al., 2022).
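To show how a basis-expansion surrogate yields predictive means and credible intervals, the sketch below uses a deliberately simplified stand-in: conjugate Bayesian linear regression with a polynomial-plus-Gaussian basis, a fixed Gaussian prior, and known homoscedastic noise, so the posterior is available in closed form. It does not use NUTS, hierarchical priors, or a heteroscedastic error model, and the synthetic cabin-noise curve is invented for the example.

```python
import numpy as np


def design_matrix(s, centres, width):
    """Columns: constant, linear term, then Gaussian bumps."""
    gauss = np.exp(-0.5 * ((s[:, None] - centres[None, :]) / width) ** 2)
    return np.hstack([np.ones((len(s), 1)), s[:, None], gauss])


def fit_posterior(phi, y, alpha, beta):
    """Weight posterior for prior N(0, I/alpha) and noise precision beta."""
    precision = alpha * np.eye(phi.shape[1]) + beta * phi.T @ phi
    cov = np.linalg.inv(precision)
    return beta * cov @ phi.T @ y, cov


def predict(phi_new, mean, cov, beta):
    """Predictive mean and standard deviation at new inputs."""
    mu = phi_new @ mean
    var = 1.0 / beta + np.einsum("ij,jk,ik->i", phi_new, cov, phi_new)
    return mu, np.sqrt(var)


def truth(s):
    """Hypothetical SPL (dB) versus normalized speed s in [0, 1]."""
    return 60.0 + 6.0 * s + 1.5 * np.sin(4.0 * s)


rng = np.random.default_rng(0)
noise_sd = 0.5
s_train = rng.random(80)
y_train = truth(s_train) + noise_sd * rng.standard_normal(80)

centres, width = np.linspace(0.0, 1.0, 6), 0.2
offset = y_train.mean()                      # centre targets before fitting
w_mean, w_cov = fit_posterior(design_matrix(s_train, centres, width),
                              y_train - offset, alpha=1e-2, beta=1.0 / noise_sd**2)

s_new = np.linspace(0.0, 1.0, 50)
mu, sd = predict(design_matrix(s_new, centres, width), w_mean, w_cov,
                 beta=1.0 / noise_sd**2)
mu = mu + offset
lower, upper = mu - 1.96 * sd, mu + 1.96 * sd   # approximate 95% interval
```

Once the posterior is stored, each prediction is a matrix-vector product, which is why evaluation of such surrogates is fast.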
Physics-Aided Ray-based Surrogates
In data-constrained underwater acoustics, the Ray-Basis Neural Network (RBNN) architecture parameterizes the sound field as a sparse sum over exact high-frequency solutions (plane or spherical waves):
$$p(\mathbf{r}) \approx \sum_{n=1}^{N} A_n \exp\!\big(i\,(k\,\hat{\mathbf{d}}_n \cdot \mathbf{r} + \phi_n)\big),$$
shown here for the plane-wave case, where the amplitude $A_n$, phase $\phi_n$, and direction $\hat{\mathbf{d}}_n$ of each ray are learnable, $k$ is the wavenumber, and environmental knowledge (e.g., ray paths, reflection coefficients) can be injected via additional neural layers. By construction, the surrogate satisfies the governing PDE (Helmholtz equation) everywhere, admitting controlled extrapolation and interpretability. In benchmark cases, RBNNs achieve low RMS errors (order-of-magnitude improvements over GPR/DNN baselines) and recover physical parameters (e.g., seabed reflection laws, geoacoustic constants) to within a few percent (Li et al., 2022).
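The claim that a ray-basis field satisfies the Helmholtz equation by construction can be checked numerically. The sketch below, assuming a 2-D homogeneous medium and arbitrary illustrative ray parameters (not values from Li et al., 2022), builds a field from three plane waves and evaluates the residual $\nabla^2 p + k^2 p$ with a five-point finite-difference stencil.

```python
import numpy as np


def ray_basis_field(points, amplitudes, phases, angles, k):
    """Complex pressure at `points` (M, 2) from a sum of plane waves."""
    directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # unit vectors
    travel = points @ directions.T                                   # d_n . r
    return (amplitudes * np.exp(1j * (k * travel + phases))).sum(axis=1)


def helmholtz_residual(field_fn, point, k, h=1e-3):
    """Finite-difference estimate of laplacian(p) + k^2 p at one point."""
    ex, ey = np.array([h, 0.0]), np.array([0.0, h])
    stencil = np.stack([point + ex, point - ex, point + ey, point - ey, point])
    p = field_fn(stencil)
    laplacian = (p[0] + p[1] + p[2] + p[3] - 4.0 * p[4]) / h**2
    return laplacian + k**2 * p[4]


k = 2.0 * np.pi * 1000.0 / 1500.0          # 1 kHz in water, c = 1500 m/s
amplitudes = np.array([1.0, 0.6, 0.3])     # illustrative "learned" values
phases = np.array([0.0, 0.8, -1.2])
angles = np.deg2rad([10.0, -25.0, 60.0])


def field(points):
    return ray_basis_field(points, amplitudes, phases, angles, k)


residual = helmholtz_residual(field, np.array([12.3, -4.5]), k)
print(abs(residual))  # small; limited only by finite-difference error
```

Because every basis function is itself a solution, the residual stays at discretization level for any choice of amplitudes, phases, and directions, so training only has to fit data rather than also enforce the PDE.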
Gaussian Process and Encoder-based Models for Environmental Acoustics
Hybrid frameworks for large-scale, real-time digital twins—e.g., 3D oceanic transmission loss—use an additive split:
$$\mathrm{TL}(\mathbf{x}) = \mathrm{TL}_{\mathrm{phys}}(\mathbf{x}) + \delta(\mathbf{x}),$$
where $\mathrm{TL}_{\mathrm{phys}}$ encodes geometric spreading and frequency-dependent absorption, and the residual $\delta$ is modeled by a sparse variational Gaussian process (SVGP) over latent features extracted from source/receiver/frequency and environmental encoders (e.g., Conv1D over bathymetry). The framework achieves sub-dB bias and a speedup over Bellhop3D, with uncertainty quantification directly from the SVGP posterior (Deo et al., 30 Sep 2025).
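A toy version of the additive split makes the division of labour concrete. In this sketch the physics term is spherical spreading plus Thorp's empirical absorption formula, and the residual is fitted with an exact Gaussian process on range alone; both are stand-ins chosen here for brevity (the source uses an SVGP over learned encoder features), and the "solver" data are synthetic.

```python
import numpy as np


def thorp_absorption_db_per_km(f_khz):
    """Thorp's empirical seawater absorption (dB/km), f in kHz."""
    f2 = f_khz**2
    return 0.11 * f2 / (1.0 + f2) + 44.0 * f2 / (4100.0 + f2) + 2.75e-4 * f2 + 0.003


def tl_physics(range_km, f_khz):
    """Spherical spreading plus absorption, in dB."""
    return 20.0 * np.log10(range_km * 1000.0) + thorp_absorption_db_per_km(f_khz) * range_km


def rbf(a, b, length, scale):
    return scale**2 * np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)


def gp_fit_predict(x_train, y_train, x_test, length=2.0, scale=3.0, noise=0.1):
    """Exact GP regression: posterior mean and standard deviation."""
    k_train = rbf(x_train, x_train, length, scale) + noise**2 * np.eye(len(x_train))
    k_cross = rbf(x_test, x_train, length, scale)
    mean = k_cross @ np.linalg.solve(k_train, y_train)
    var = scale**2 - np.einsum("ij,ji->i", k_cross, np.linalg.solve(k_train, k_cross.T))
    return mean, np.sqrt(np.maximum(var, 0.0))


def residual_truth(range_km):
    """Hypothetical environment-dependent deviation from the physics term."""
    return 3.0 * np.sin(range_km / 2.0)


rng = np.random.default_rng(0)
f_khz = 1.0
r_train = np.linspace(1.0, 20.0, 40)
tl_solver = tl_physics(r_train, f_khz) + residual_truth(r_train) + 0.1 * rng.standard_normal(40)

# Fit only what the physics term does not explain.
r_test = np.linspace(1.5, 19.5, 50)
res_mean, res_sd = gp_fit_predict(r_train, tl_solver - tl_physics(r_train, f_khz), r_test)
tl_pred = tl_physics(r_test, f_khz) + res_mean
tl_true = tl_physics(r_test, f_khz) + residual_truth(r_test)
print(np.max(np.abs(tl_pred - tl_true)))
```

The GP only has to learn a residual of a few dB rather than the full transmission loss, and `res_sd` provides the predictive uncertainty.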
4. Deep Generative Surrogates for Wave and Channel Fields
Recent advances leverage deep diffusion and generative models for large-scale and high-dimensional acoustic field prediction.
Image-based Surrogates for Helmholtz Solutions
Diffusion models equipped with ControlNet conditioning have been adapted to the Helmholtz equation on the HA30K materials dataset:
- Inputs: binary mask images representing obstacle layouts; text encoding of materials and source properties.
- Outputs: RGB images encoding the simulated pressure field.
- Only the ControlNet is trained atop a frozen Stable Diffusion U-Net, minimizing a denoising loss summed across noise levels.
- Batching enables evaluation of multiple fields in parallel, giving a speed-up over finite element solvers. Image quality is reported via SSIM, FID, and MSE at 50 diffusion steps. Physics constraints are not enforced at inference; generalization is restricted to the training frequency and geometry grid (Gramaccioni et al., 7 Oct 2025).
Conditional Diffusion for Channel Dynamics
StableUASim, a latent diffusion surrogate for time-varying underwater acoustic channels, combines a three-layer Bi-LSTM autoencoder (128-dim latent) with a conditional diffusion model. The forward model is a Markov noising process; the reverse (generation) model is a neural denoiser conditioned on a measurement-derived latent. Pretraining on simulated time-varying impulse responses (TVIRs) enables rapid adaptation (<25 samples) to new environments. Generated channels reproduce amplitude/phase statistics and bit-error-rate (BER) curves that closely match real measurements, outperforming GAN and stochastic replay baselines. Once adapted, channel sampling is 100 ms per realization on GPU hardware (Li et al., 22 Nov 2025).
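The forward Markov noising process and the denoising objective can be sketched generically. The example below assumes the standard DDPM formulation with a linear noise schedule ($10^{-4}$ to $0.02$ over 1000 steps) applied to random 128-dimensional latents; the schedule, step count, and latents are assumptions, and no trained denoiser is included.

```python
import numpy as np


def linear_beta_schedule(n_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances (assumed DDPM-style linear schedule)."""
    return np.linspace(beta_start, beta_end, n_steps)


def q_sample(z0, t, alpha_bar, rng):
    """Jump straight to step t: z_t = sqrt(a_t) z_0 + sqrt(1 - a_t) eps."""
    eps = rng.standard_normal(z0.shape)
    a = alpha_bar[t][:, None]
    return np.sqrt(a) * z0 + np.sqrt(1.0 - a) * eps, eps


def denoising_loss(eps_pred, eps):
    """Mean-squared error between predicted and true noise."""
    return float(np.mean((eps_pred - eps) ** 2))


rng = np.random.default_rng(0)
n_steps = 1000
alpha_bar = np.cumprod(1.0 - linear_beta_schedule(n_steps))

z0 = 0.5 * rng.standard_normal((256, 128))     # stand-in for encoded channels
t = rng.integers(0, n_steps, size=256)         # random step per sample
z_t, eps = q_sample(z0, t, alpha_bar, rng)

# At the last step almost no signal remains, so z_T is close to N(0, I).
z_last, _ = q_sample(z0, np.full(256, n_steps - 1), alpha_bar, rng)
print(alpha_bar[-1], z_last.std())
```

During training, a conditional network would receive `z_t`, `t`, and the conditioning latent and be optimized with `denoising_loss` against `eps`; generation runs the learned reverse process starting from Gaussian noise.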
5. Data Acquisition, Experimental Pipelines, and Compressed Feature Design
Surrogate fidelity strongly depends on the quality and diversity of training data:
- Robotic experimental frameworks combine automated measurement and geometric parameterization for complex diffusive panels. Multi-scale geometry is controlled at macro, meso, and micro levels; normalized cumulative energy curves across hundreds of IRs per panel are extracted via bandpass wavelet frames and used as compressed surrogate targets. Planned modeling includes both dense neural regressors and tree-based architectures, with emphasis on dimensional reduction (PCA, SOM) and data augmentation strategies (Rust et al., 2021).
- Inverse problems in acoustics employ surrogates for source localization: a fully-connected NN trained on synthetic PDE solutions enables rapid evaluation of Bayesian posteriors over source positions using MCMC, realizing speed-ups over direct solvers. Surrogate model error is propagated through the likelihood; credible intervals reflect the combined measurement and surrogate uncertainty (Ersin et al., 2023).
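The surrogate-accelerated inversion in the last item can be sketched with a random-walk Metropolis sampler. Everything specific here is hypothetical: a 1-D source position, five sensors on a line, an analytic amplitude-decay function standing in for the trained network, and a likelihood variance formed as the sum of assumed measurement and surrogate error variances. The data are generated from the same stand-in model, so the example illustrates the mechanics only.

```python
import numpy as np

SENSORS = np.linspace(0.0, 1.0, 5)   # hypothetical sensor positions
STANDOFF = 0.2                       # hypothetical source-to-array offset


def surrogate_forward(x_src):
    """Stand-in for a trained NN surrogate: amplitude at each sensor."""
    return 1.0 / np.sqrt((SENSORS - x_src) ** 2 + STANDOFF**2)


def log_posterior(x_src, data, sigma_meas, sigma_surr):
    """Uniform prior on [0, 1]; Gaussian likelihood with inflated variance."""
    if not 0.0 <= x_src <= 1.0:
        return -np.inf
    var = sigma_meas**2 + sigma_surr**2   # measurement + surrogate error
    resid = data - surrogate_forward(x_src)
    return -0.5 * np.sum(resid**2) / var


def metropolis(data, n_steps, step, rng, sigma_meas=0.05, sigma_surr=0.02, x0=0.5):
    """Random-walk Metropolis over the source position."""
    chain = np.empty(n_steps)
    x, lp = x0, log_posterior(x0, data, sigma_meas, sigma_surr)
    for i in range(n_steps):
        proposal = x + step * rng.standard_normal()
        lp_prop = log_posterior(proposal, data, sigma_meas, sigma_surr)
        if np.log(rng.random()) < lp_prop - lp:   # accept/reject
            x, lp = proposal, lp_prop
        chain[i] = x
    return chain


rng = np.random.default_rng(0)
x_true = 0.3
data = surrogate_forward(x_true) + 0.05 * rng.standard_normal(SENSORS.size)

chain = metropolis(data, n_steps=5000, step=0.02, rng=rng)
posterior = chain[1000:]                          # discard burn-in
lo, hi = np.percentile(posterior, [2.5, 97.5])    # 95% credible interval
print(posterior.mean(), lo, hi)
```

Each MCMC step costs one surrogate evaluation instead of one PDE solve, which is where the speed-up over direct solvers comes from; adding `sigma_surr` widens the credible interval to reflect surrogate error.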
6. Limitations, Generalization, and Future Directions
- Purely data-driven surrogates are limited by the volume and diversity of training data and often lack guaranteed physical consistency, especially when extrapolating beyond the measured or simulated parameter space. Physics-augmented surrogates (RBNN, physics-guided mean functions, operator-embedded NNs) mitigate this risk by leveraging known analytic forms or PDE constraints, yielding improved generalization and interpretability (Li et al., 2022, Deo et al., 30 Sep 2025).
- In high-frequency, low-diffraction regimes, ray-based surrogates provide efficient and interpretable architectures; at lower frequencies or in highly reverberant/diffusive systems, alternative bases (e.g., normal modes) or nonparametric ML surrogates become necessary.
- Generative diffusion-based surrogates enable scalable, parallelizable field and channel synthesis, but typically require large datasets for pretraining and may not enforce exact physical boundary or constitutive constraints unless explicitly regularized.
- Bayesian surrogates and probabilistic ML frameworks enable uncertainty quantification and risk assessment, integrating well with design optimization and early-stage concept screening.
- Future advances will likely combine operator learning, generative modeling, uncertainty-aware inference, and modular data acquisition/processing pipelines, targeting truly multi-modal, multi-scale acoustic system design and real-time digital twin deployment across architectural, material, marine, and communications acoustics (Trinh et al., 2017, Prakash et al., 2022, Li et al., 22 Nov 2025).
7. Representative Model Features and Domains
| Approach | Target Domain | Physical Knowledge |
|---|---|---|
| Polynomial metamodels (Trinh et al., 2017) | Multiscale material design | Moderate (basis expansion) |
| ML regressors with physics features (Cunha et al., 2022) | Vibroacoustics (STL, transmission) | High (feature engineering) |
| Bayesian GAMs (Prakash et al., 2022) | Vehicle interior noise (NVH) | Moderate (basis/priors) |
| RBNN/RCNN (Li et al., 2022) | Underwater field, reverberation | Strong (governing solution) |
| Neural surrogate for PDE source inversion (Ersin et al., 2023) | Acoustic wave, inverse problem | Weak (learn solution map) |
| Stable Diffusion + ControlNet (Gramaccioni et al., 7 Oct 2025) | Helmholtz, acoustic fields | None (generative) |
| StableUASim latent diffusion (Li et al., 22 Nov 2025) | Time-varying channel modeling | Moderate (sim. pretraining) |
A plausible implication is that the field is moving toward hybrid frameworks where physical insight is leveraged for sample efficiency and robust extrapolation, while generative and deep learning models enable scalable, end-to-end surrogate prediction and uncertainty-aware optimization.