A Hybrid Approach for Speech Enhancement Using MoG Model and Neural Network Phoneme Classifier (1510.07315v1)

Published 25 Oct 2015 in cs.SD

Abstract: In this paper we present a single-microphone speech enhancement algorithm. A hybrid approach is proposed that merges the generative mixture of Gaussians (MoG) model and the discriminative neural network (NN). The proposed algorithm is executed in two phases: the training phase, which is performed only once, and the test phase. First, the noise-free speech power spectral density (PSD) is modeled as a MoG, representing the phoneme-based diversity in the speech signal. An NN is then trained on a phoneme-labeled database for phoneme classification, with mel-frequency cepstral coefficients (MFCC) as the input features. Given the phoneme classification results, a speech presence probability (SPP) is obtained using both the generative and discriminative models. Soft spectral subtraction is then executed while the noise estimate is simultaneously updated. The discriminative NN maintains the continuity of the speech, and the generative phoneme-based MoG preserves the speech spectral structure. An extensive experimental study using real speech and noise signals is provided. We also compare the proposed algorithm with alternative speech enhancement algorithms. We show that we obtain a significant improvement over previous methods in terms of both speech quality measures and speech recognition results.
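The test-phase steps described in the abstract (combining phoneme posteriors with a phoneme-based Gaussian model to get an SPP, then applying soft spectral subtraction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the formulas, so the per-bin dominance test (speech versus noise under Gaussian log-spectral models), the function names, the parameter names, and the fixed attenuation value are all assumptions made for illustration.

```python
import math
import numpy as np

# Vectorized error function built from the standard library (avoids a SciPy dependency).
_erf = np.vectorize(math.erf)


def normal_pdf(x, mean, std):
    """Gaussian density, broadcast over arrays."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def normal_cdf(x, mean, std):
    """Gaussian cumulative distribution, broadcast over arrays."""
    return 0.5 * (1.0 + _erf((x - mean) / (std * np.sqrt(2.0))))


def speech_presence_probability(z, phoneme_post, speech_mean, speech_std,
                                noise_mean, noise_std):
    """Hypothetical per-bin SPP for one frame.

    z            : (K,)   noisy log-spectrum
    phoneme_post : (M,)   phoneme posteriors, e.g. from an NN classifier
    speech_mean  : (M, K) per-phoneme Gaussian means of the clean log-spectrum
    speech_std   : (M, K) per-phoneme Gaussian standard deviations
    noise_mean   : (K,)   noise log-spectrum mean
    noise_std    : (K,)   noise log-spectrum standard deviation

    Assumption: in each bin the observation equals the larger of speech and
    noise, so speech dominates with weight f_speech(z) * F_noise(z).
    """
    f = normal_pdf(z, speech_mean, speech_std)   # (M, K)
    F = normal_cdf(z, speech_mean, speech_std)   # (M, K)
    g = normal_pdf(z, noise_mean, noise_std)     # (K,)
    G = normal_cdf(z, noise_mean, noise_std)     # (K,)
    speech_dominant = f * G
    noise_dominant = F * g
    rho = speech_dominant / (speech_dominant + noise_dominant + 1e-12)
    # Average the per-phoneme probabilities using the classifier posteriors.
    return phoneme_post @ rho                    # (K,)


def soft_spectral_subtraction(z, spp, attenuation_db=10.0):
    """Keep bins where speech is likely; attenuate the rest in the log domain."""
    spp = np.clip(spp, 0.0, 1.0)
    beta = attenuation_db * math.log(10.0) / 20.0  # dB -> natural-log amplitude
    return spp * z + (1.0 - spp) * (z - beta)
```

In this sketch, a bin with SPP near 1 passes through unchanged, while a bin with SPP near 0 is lowered by the full attenuation; intermediate values interpolate between the two, which is what makes the subtraction "soft".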

