Generalization of lower-parameterized diffusion-based speech enhancement models
Determine whether diffusion-based speech enhancement models that use lower-parameterized neural networks as score models (for example, compact 2D convolutional UNet variants) can retain the strong generalization to unseen noisy speech conditions that has been reported for larger score-based generative models such as SGMSE and SGMSE+.
References
However, it remains unclear if a lower-parameterized NN still results in the good generalization to unseen data reported in [richter_sgmse].
— Diffusion Buffer for Online Generative Speech Enhancement
(2510.18744 - Lay et al., 21 Oct 2025) in Introduction (Section 1), Footnote after the 'Video and code' link