
FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement (2203.12188v2)

Published 23 Mar 2022 in cs.SD, cs.AI, and eess.AS

Abstract: The previously proposed FullSubNet achieved outstanding performance in the Deep Noise Suppression (DNS) Challenge and attracted much attention. However, it still encounters issues such as input-output mismatch and coarse processing of frequency bands. In this paper, we propose an extended single-channel real-time speech enhancement framework called FullSubNet+ with the following significant improvements. First, we design a lightweight multi-scale time-sensitive channel attention (MulCA) module, which adopts multi-scale convolution and a channel attention mechanism to help the network focus on the more discriminative frequency bands for noise reduction. Then, to make full use of the phase information in noisy speech, our model takes the magnitude, real, and imaginary spectrograms as inputs. Moreover, by replacing the long short-term memory (LSTM) layers in the original full-band model with stacked temporal convolutional network (TCN) blocks, we design a more efficient full-band module called the full-band extractor. Experimental results on the DNS Challenge dataset show the superior performance of FullSubNet+, which reaches state-of-the-art (SOTA) performance and outperforms other existing speech enhancement approaches.
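The core idea of the MulCA module described above can be illustrated with a toy sketch: convolve each frequency band's time trajectory at several scales, pool over time, and turn the fused per-band statistics into attention weights that re-weight the spectrogram. This is a simplified, hand-rolled illustration with assumed kernel sizes and a fixed sigmoid gate; the paper's actual module uses learned convolutions and fully connected layers, which are not reproduced here.

```python
import numpy as np

def mulca_weights(spec, kernel_sizes=(3, 5, 7)):
    """Toy multi-scale channel attention over frequency bands.

    spec: (freq_bins, time_frames) magnitude spectrogram.
    Returns one attention weight in (0, 1) per frequency band.
    NOTE: kernel sizes and the sigmoid gate are illustrative
    assumptions, not the paper's learned parameters.
    """
    n_freq, _ = spec.shape
    per_scale = []
    for k in kernel_sizes:
        kernel = np.ones(k) / k  # averaging kernel at this time scale
        # Convolve each band's time trajectory, then average over time.
        smoothed = np.stack(
            [np.convolve(spec[f], kernel, mode="same") for f in range(n_freq)]
        )
        per_scale.append(smoothed.mean(axis=1))  # (n_freq,)
    fused = np.mean(per_scale, axis=0)  # fuse the multi-scale statistics
    # Normalized sigmoid gate -> per-band weights in (0, 1).
    z = (fused - fused.mean()) / (fused.std() + 1e-8)
    return 1.0 / (1.0 + np.exp(-z))

# Usage: re-weight a random 257-bin spectrogram by its band attention.
spec = np.abs(np.random.randn(257, 100))
w = mulca_weights(spec)            # shape (257,)
reweighted = spec * w[:, None]     # bands with higher energy get emphasized
```

In the real model these weights come from trainable layers, but the sketch shows the data flow: multi-scale temporal features per band, fused into a single scalar gate per frequency channel.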

Authors (6)
  1. Jun Chen (374 papers)
  2. Zilin Wang (30 papers)
  3. Deyi Tuo (7 papers)
  4. Zhiyong Wu (171 papers)
  5. Shiyin Kang (27 papers)
  6. Helen Meng (204 papers)
Citations (98)
