Do strong EER/WER results predict downstream speech generation performance?
Determine whether speaker anonymization (SA) systems that perform well on the equal error rate (EER) and word error rate (WER) metrics also perform well when their anonymized speech is used as training data for downstream speech generation tasks.
References
On the other hand, evaluating these SA systems in the context of speech generation model training has not yet been investigated, and it is unknown whether an SA system that performs well in terms of EER and WER can also excel in the downstream speech generation task.
— Multi-speaker Text-to-speech Training with Speaker Anonymized Data
(2405.11767 - Huang et al., 20 May 2024) in Section 1 (Introduction)