Feature Enhancement with Deep Feature Losses for Speaker Verification (1910.11905v2)

Published 25 Oct 2019 in eess.AS and cs.SD

Abstract: Speaker verification still suffers from poor generalization to novel adverse environments. We leverage recent advances in deep-learning-based speech enhancement and propose a feature-domain, supervised denoising solution. We propose to use a Deep Feature Loss, which optimizes the enhancement network in the hidden activation space of a pre-trained auxiliary speaker embedding network. We experimentally verify the approach on simulated and real data. A simulated testing setup is created using various noise types at different SNR levels. For evaluation on real data, we choose the BabyTrain corpus, which consists of recordings of children in uncontrolled environments. We observe consistent gains in every condition over the state-of-the-art augmented Factorized-TDNN x-vector system. On the BabyTrain corpus, we observe relative gains of 10.38% and 12.40% in minDCF and EER, respectively.
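To make the idea of a loss computed "in the hidden activation space" concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say which layers are compared, which distance is used, or how layers are weighted. The toy network `aux_layers`, the choice of mean absolute difference, and the function names are all hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu_layer(weights: Matrix, x: Vector) -> Vector:
    """Apply one dense layer (no bias) followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def hidden_activations(layers: List[Matrix], x: Vector) -> List[Vector]:
    """Run x through the frozen auxiliary network, keeping each layer's output."""
    activations = []
    for weights in layers:
        x = relu_layer(weights, x)
        activations.append(x)
    return activations


def deep_feature_loss(layers: List[Matrix], enhanced: Vector, clean: Vector) -> float:
    """Sum over layers of the mean absolute activation difference.

    Assumption: L1 distance with equal layer weights; the abstract does not
    specify either choice.
    """
    total = 0.0
    acts_enhanced = hidden_activations(layers, enhanced)
    acts_clean = hidden_activations(layers, clean)
    for a, b in zip(acts_enhanced, acts_clean):
        total += sum(abs(p - q) for p, q in zip(a, b)) / len(a)
    return total


# Hypothetical toy auxiliary network standing in for a pre-trained
# speaker embedding network (its weights stay fixed).
aux_layers = [
    [[1.0, 0.0, -1.0], [0.5, 0.5, 0.0]],  # layer 1: 3 inputs -> 2 units
    [[1.0, -1.0], [0.0, 2.0]],            # layer 2: 2 inputs -> 2 units
]
clean = [1.0, 2.0, 0.5]     # hypothetical clean feature frame
enhanced = [1.0, 1.0, 0.5]  # hypothetical output of the enhancement network

loss = deep_feature_loss(aux_layers, enhanced, clean)
print(loss)
```

In a real training setup the enhancement network's parameters would be updated to reduce this value while the auxiliary network stays frozen. The comparison happens on activations rather than directly on the features, which is the distinction the abstract draws.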

Citations (29)
