Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
120 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
46 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Model-Based Speech Enhancement in the Modulation Domain (1707.02651v3)

Published 9 Jul 2017 in cs.SD

Abstract: This paper presents an algorithm for modulation-domain speech enhancement using a Kalman filter. The proposed estimator jointly models the estimated dynamics of the spectral amplitudes of speech and noise to obtain an MMSE estimation of the speech amplitude spectrum with the assumption that the speech and noise are additive in the complex domain. In order to include the dynamics of noise amplitudes with those of speech amplitudes, we propose a statistical "Gaussring" model that comprises a mixture of Gaussians whose centers lie in a circle on the complex plane. The performance of the proposed algorithm is evaluated using the perceptual evaluation of speech quality measure, segmental SNR measure, and short-time objective intelligibility measure. For speech quality measures, the proposed algorithm is shown to give a consistent improvement over a wide range of SNRs when compared to competitive algorithms. Speech recognition experiments also show that the Gaussring-model-based algorithm performs well for two types of noise.

Citations (24)

Summary

We haven't generated a summary for this paper yet.