
Investigation of Monaural Front-End Processing for Robust ASR without Retraining or Joint-Training (1810.09067v2)

Published 22 Oct 2018 in cs.SD, cs.MM, and eess.AS

Abstract: In recent years, monaural speech separation has been formulated as a supervised learning problem, which has been systematically researched and has shown dramatic improvements in speech intelligibility and quality for human listeners. However, it has not been well investigated whether these methods can be employed as front-end processing to directly improve the performance of a machine listener, i.e., an automatic speech recognizer, without retraining or jointly training the acoustic model. In this paper, we explore the effectiveness of independent front-end processing for multi-conditionally trained ASR on the CHiME-3 challenge. We find that directly feeding the enhanced features to ASR yields 36.40% and 11.78% relative WER reductions for the GMM-based and DNN-based ASR, respectively. We also investigate the effect of the noisy phase and the generalization ability under unmatched noise conditions.
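The front-end setup the abstract describes, applying a separately trained enhancement model to noisy features before passing them to a fixed recognizer, can be sketched as below. This is a minimal illustration, not the paper's implementation: the mask is a random stand-in for a DNN-estimated time-frequency mask, and the reconstruction step shows the noisy-phase recombination whose effect the paper investigates.

```python
import numpy as np

def enhance_features(noisy_mag, mask):
    """Apply an estimated time-frequency mask to a noisy magnitude
    spectrogram -- a common mask-based enhancement front-end (illustrative)."""
    return noisy_mag * np.clip(mask, 0.0, 1.0)

def reconstruct(enhanced_mag, noisy_phase):
    """Recombine enhanced magnitudes with the *noisy* phase; the
    enhanced signal inherits the noisy signal's phase spectrum."""
    return enhanced_mag * np.exp(1j * noisy_phase)

# Toy example: 4 time frames x 3 frequency bins
rng = np.random.default_rng(0)
noisy_mag = rng.random((4, 3))
noisy_phase = rng.uniform(-np.pi, np.pi, (4, 3))
mask = rng.random((4, 3))  # stand-in for a learned mask estimate

enhanced = enhance_features(noisy_mag, mask)
complex_spec = reconstruct(enhanced, noisy_phase)
# complex_spec would then be inverted to a waveform (or converted to
# acoustic features) and fed to the unmodified ASR system.
```

In the "without retraining" setting studied here, only the enhancement stage sees the noisy/clean training pairs; the recognizer's acoustic model is left untouched.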

Authors (3)
  1. Zhihao Du (30 papers)
  2. Xueliang Zhang (39 papers)
  3. Jiqing Han (26 papers)
Citations (1)

Summary

We haven't generated a summary for this paper yet.