
Improved Speaker-Dependent Separation for CHiME-5 Challenge (1904.03792v1)

Published 8 Apr 2019 in eess.AS and cs.SD

Abstract: This paper summarizes several follow-up contributions that improve our submitted NWPU speaker-dependent system for the CHiME-5 challenge, which aims to solve the problem of multi-channel, highly overlapped conversational speech recognition in a dinner-party scenario with reverberation and non-stationary noise. We adopt a speaker-aware training method that uses the i-vector as the target speaker information for multi-talker speech separation. With only one unified separation model for all speakers, we achieve a 10% absolute improvement in word error rate (WER) over the previous baseline of 80.28% on the development set by leveraging our newly proposed data processing techniques and beamforming approach. With our improved back-end acoustic model, we further reduce the WER to 60.15%, which surpasses the result of our submitted CHiME-5 challenge system without applying any fusion techniques.
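
The abstract reports its results as WER and as an absolute improvement. As a reading aid, here is a minimal sketch of how WER is conventionally computed (word-level edit distance divided by the reference length) and how an absolute reduction differs from a relative one. It is not the challenge's scoring tool; the function names `word_error_rate`, `absolute_reduction`, and `relative_reduction` are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


def absolute_reduction(baseline_wer: float, new_wer: float) -> float:
    """Difference in WER, in the same units as the inputs (here, percent)."""
    return baseline_wer - new_wer


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Reduction expressed as a fraction of the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer
```

Applied to the two figures quoted in the abstract, `absolute_reduction(80.28, 60.15)` gives about 20.13 percentage points, and `relative_reduction(80.28, 60.15)` gives roughly 0.25, that is, about a quarter of the baseline errors.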

Authors (7)
  1. Jian Wu (314 papers)
  2. Yong Xu (432 papers)
  3. Shi-Xiong Zhang (48 papers)
  4. Lian-Wu Chen (2 papers)
  5. Meng Yu (65 papers)
  6. Lei Xie (337 papers)
  7. Dong Yu (329 papers)
Citations (3)
