Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
88 tokens/sec
Gemini 2.5 Pro Premium
46 tokens/sec
GPT-5 Medium
16 tokens/sec
GPT-5 High Premium
19 tokens/sec
GPT-4o
95 tokens/sec
DeepSeek R1 via Azure Premium
90 tokens/sec
GPT OSS 120B via Groq Premium
476 tokens/sec
Kimi K2 via Groq Premium
234 tokens/sec
2000 character limit reached

CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR (2502.20040v1)

Published 27 Feb 2025 in eess.AS, cs.AI, and cs.SD

Abstract: In this work, we propose CleanMel, a single-channel Mel-spectrogram denoising and dereverberation network for improving both speech quality and automatic speech recognition (ASR) performance. The proposed network takes as input the noisy and reverberant microphone recording and predicts the corresponding clean Mel-spectrogram. The enhanced Mel-spectrogram can be either transformed to speech waveform with a neural vocoder or directly used for ASR. The proposed network is composed of interleaved cross-band and narrow-band processing in the Mel-frequency domain, for learning the full-band spectral pattern and the narrow-band properties of signals, respectively. Compared to linear-frequency domain or time-domain speech enhancement, the key advantage of Mel-spectrogram enhancement is that Mel-frequency presents speech in a more compact way and thus is easier to learn, which will benefit both speech quality and ASR. Experimental results on four English and one Chinese datasets demonstrate a significant improvement in both speech quality and ASR performance achieved by the proposed model. Code and audio examples of our model are available online in https://audio.westlake.edu.cn/Research/CleanMel.html.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube