Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
86 tokens/sec
Gemini 2.5 Pro Premium
40 tokens/sec
GPT-5 Medium
27 tokens/sec
GPT-5 High Premium
32 tokens/sec
GPT-4o
94 tokens/sec
DeepSeek R1 via Azure Premium
94 tokens/sec
GPT OSS 120B via Groq Premium
469 tokens/sec
Kimi K2 via Groq Premium
198 tokens/sec
2000 character limit reached

A Two-Stage Hierarchical Deep Filtering Framework for Real-Time Speech Enhancement (2506.01023v1)

Published 1 Jun 2025 in cs.SD, cs.AI, and eess.AS

Abstract: This paper proposes a model that integrates sub-band processing and deep filtering to fully exploit information from the target time-frequency (TF) bin and its surrounding TF bins for single-channel speech enhancement. The sub-band module captures surrounding frequency bin information at the input, while the deep filtering module applies filtering at the output to both the target TF bin and its surrounding TF bins. To further improve the model performance, we decouple deep filtering into temporal and frequency components and introduce a two-stage framework, reducing the complexity of filter coefficient prediction at each stage. Additionally, we propose the TAConv module to strengthen convolutional feature extraction. Experimental results demonstrate that the proposed hierarchical deep filtering network (HDF-Net) effectively utilizes surrounding TF bin information and outperforms other advanced systems while using fewer resources.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.