Hardness of Jointly Fine-Tuning the eGeMAPS Estimator with Enhancement Models
Determine whether fine-tuning the eGeMAPS estimator jointly with the speech enhancement model (such as Demucs or FullSubNet) creates a harder optimization problem than fine-tuning the enhancement model alone while the eGeMAPS estimator is kept fixed, and clarify whether joint fine-tuning makes the estimator more robust to enhanced speech as input.
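The two configurations being compared can be sketched with a deliberately tiny toy, assuming a scalar "enhancement model" (a gain g) and a scalar "estimator" (a weight w applied to signal energy). All names and values here (train, noisy_energy, target, the squared-error loss) are hypothetical stand-ins chosen for illustration; none of them come from Yang et al., and the real models are deep networks rather than scalars.

```python
def train(joint, steps=300, lr=0.05):
    """Gradient descent on a toy feature-matching loss.

    joint=False: only the enhancement gain g is updated (estimator frozen).
    joint=True:  g and the estimator weight w are both updated.
    """
    noisy_energy = 1.0   # hypothetical mean energy of the noisy input
    target = 0.25        # hypothetical reference feature of the clean speech
    g, w = 1.0, 1.0      # toy enhancement gain and toy estimator weight

    for _ in range(steps):
        # Estimator applied to the enhanced signal: w * energy(g * x).
        pred = w * g * g * noisy_energy
        err = pred - target
        # Analytic gradients of (pred - target)**2 w.r.t. g and w.
        grad_g = 2.0 * err * w * 2.0 * g * noisy_energy
        grad_w = 2.0 * err * g * g * noisy_energy
        g -= lr * grad_g
        if joint:
            # Joint fine-tuning: the estimator is trained as well.
            w -= lr * grad_w

    loss = (w * g * g * noisy_energy - target) ** 2
    return g, w, loss


if __name__ == "__main__":
    for joint in (False, True):
        g, w, loss = train(joint)
        print(f"joint={joint}: g={g:.4f}, w={w:.4f}, loss={loss:.2e}")
```

In this toy, the frozen setting has a single positive gain that zeroes the loss (g = 0.5), whereas the joint setting can reach zero loss for any pair satisfying w * g**2 = 0.25, so the gain is no longer pinned down by the loss alone. The sketch only illustrates what differs between the two setups (which parameters receive updates); it says nothing about how Demucs, FullSubNet, or the actual eGeMAPS estimator behave.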
References
We hypothesized that fine-tuning the estimator could add robustness to enhanced speech as input, but we conjecture that it creates a harder optimization problem.
— Improving Speech Enhancement through Fine-Grained Speech Characteristics
(2207.00237 - Yang et al., 2022) in Section 4.4 (Ablation Study of eGeMAPS Estimator)