
CLCNet: Deep learning-based Noise Reduction for Hearing Aids using Complex Linear Coding

Published 28 Jan 2020 in eess.AS, cs.LG, cs.SD, and stat.ML | (2001.10218v1)

Abstract: Noise reduction is an important part of modern hearing aids and is included in most commercially available devices. Deep learning-based state-of-the-art algorithms, however, either do not consider real-time and frequency resolution constraints or result in poor quality under very noisy conditions. To improve monaural speech enhancement in noisy environments, we propose CLCNet, a framework based on complex-valued linear coding. First, we define complex linear coding (CLC) motivated by linear predictive coding (LPC) that is applied in the complex frequency domain. Second, we propose a framework that incorporates complex spectrogram input and coefficient output. Third, we define a parametric normalization for complex-valued spectrograms that complies with low-latency and on-line processing. Our CLCNet was evaluated on a mixture of the EUROM database and a real-world noise dataset recorded with hearing aids and compared to traditional real-valued Wiener filter gains.


Summary

  • The paper introduces CLCNet, which uses complex linear coding to improve noise reduction in hearing aids.
  • It employs a multilayer perceptron (MLP) trained with a dual-objective loss (RMSE and SI-SDR) to preserve speech harmonics while meeting low-latency requirements.
  • Experimental results show improvements over traditional real-valued Wiener filter gains, especially at low SNRs.

CLCNet: Deep Learning-Based Noise Reduction for Hearing Aids Using Complex Linear Coding

Introduction

The paper presents CLCNet, a noise reduction framework for hearing aids that uses complex linear coding (CLC) and is designed for low-resolution spectrograms and real-time processing. According to the abstract, existing deep learning-based algorithms either ignore real-time and frequency-resolution constraints or perform poorly under very noisy conditions. CLCNet instead takes a complex-valued approach inspired by linear predictive coding (LPC). The system targets monaural speech enhancement in noisy settings and operates in the complex frequency domain.

Complex Linear Coding Framework

The core idea of CLCNet is complex linear coding, which carries LPC principles over to the complex frequency domain. The approach targets low-resolution spectrograms, where a single frequency bin may contain several speech harmonics, which makes it hard to separate speech from background noise within that bin. The framework formulates a generalized version of LPC in the frequency domain, so that noise reduction is performed by applying complex-valued coefficients to the noisy spectrogram.
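As a rough illustration of the idea, the sketch below applies a set of complex coefficients per frame and frequency bin to the current and preceding frames of a noisy spectrogram, in the spirit of LPC. The function name `apply_clc`, the array layout, and the zero-padding at the start of the signal are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def apply_clc(noisy_spec, coefs):
    """Apply per-bin complex linear coding to a noisy spectrogram (sketch).

    noisy_spec: complex array of shape (T, F), frames x frequency bins.
    coefs:      complex array of shape (T, N, F), N coefficients per frame and bin.

    Returns an array of shape (T, F) with
        enhanced[k, f] = sum_i coefs[k, i, f] * noisy_spec[k - i, f]
    Frames before the start of the signal are treated as zero (an assumption).
    """
    T, F = noisy_spec.shape
    order = coefs.shape[1]
    enhanced = np.zeros((T, F), dtype=complex)
    for i in range(min(order, T)):
        # Delay the spectrogram by i frames, zero-padded at the start.
        shifted = np.zeros((T, F), dtype=complex)
        shifted[i:] = noisy_spec[:T - i]
        enhanced += coefs[:, i, :] * shifted
    return enhanced
```

With a single real-valued coefficient per bin (N = 1), this reduces to an ordinary real gain mask of the kind a Wiener filter produces, which is one way to see CLC as a generalization of such gains.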

CLCNet also defines a parametric normalization for complex spectrograms that preserves the phase and is compatible with on-line processing. This lets the system normalize its input frame by frame while meeting the low-latency requirements of practical hearing aid deployment.

Figure 1: A power spectrogram using the deployed filter bank from a clean speech sample of the train set.
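The summary does not give the exact form of the parametric normalization. One plausible causal scheme, shown here purely as an assumption, divides each complex bin by an exponentially smoothed estimate of its magnitude. Because the divisor is a positive real number, the phase is left unchanged, and because only current and past frames are used, no look-ahead is needed. The name `online_unit_norm`, the smoothing factor `alpha`, and the first-frame initialization are hypothetical.

```python
import numpy as np

def online_unit_norm(spec, alpha=0.99, eps=1e-8):
    """Causal magnitude normalization of a complex spectrogram (sketch).

    spec:  complex array of shape (T, F).
    alpha: smoothing factor of the running magnitude mean (hypothetical value).
    """
    T, F = spec.shape
    # Hypothetical initialization: start the running mean at the first frame.
    mag_mean = np.abs(spec[0]).astype(float)
    out = np.empty((T, F), dtype=complex)
    for k in range(T):
        # Update the running mean using only the current and past frames.
        mag_mean = alpha * mag_mean + (1.0 - alpha) * np.abs(spec[k])
        # Dividing by a positive real value rescales magnitude, keeps phase.
        out[k] = spec[k] / (mag_mean + eps)
    return out
```

In a learned variant, `alpha` could be treated as a trainable parameter per frequency bin; whether the paper does this is not stated in the summary.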

Implementation Details

The CLCNet architecture introduces a deep learning model implemented in PyTorch, adopting a multilayer perceptron (MLP) with specific adaptations for processing complex filter bank representations. The model uses a temporal context to predict complex-valued coefficients, incorporating both past and future frames to optimize the noise reduction process.
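To make the temporal context concrete, the following sketch stacks a window of past and future frames around each frame and splits the complex values into real and imaginary parts, giving a real-valued feature vector that an MLP could consume. The window sizes `past` and `future`, the zero-padding at the signal edges, and the feature layout are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

def stack_context(spec, past=2, future=1):
    """Build per-frame context features from a complex spectrogram (sketch).

    spec: complex array of shape (T, F).
    Returns a real array of shape (T, (past + 1 + future) * 2 * F).
    """
    T, F = spec.shape
    # Zero-pad in time so every frame has a full context window.
    padded = np.pad(spec, ((past, future), (0, 0)))
    # windows[j][t] equals spec[t + j - past]; j == past is the current frame.
    windows = [padded[j:j + T] for j in range(past + 1 + future)]
    ctx = np.stack(windows, axis=1)                        # (T, W, F)
    feats = np.concatenate([ctx.real, ctx.imag], axis=-1)  # (T, W, 2F)
    return feats.reshape(T, -1)
```

Each future frame in the window adds look-ahead delay, so in a hearing aid setting `future` would have to stay small to respect the latency budget.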

Training uses a dual-objective loss that combines root mean square error (RMSE) with the scale-invariant signal-to-distortion ratio (SI-SDR). The combined loss helps preserve speech harmonics and discourages excessive noise suppression, both of which matter for keeping the enhanced speech intelligible.
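A minimal sketch of such a combined objective is given below. The SI-SDR formula follows its standard definition (project the estimate onto the target, then compare the projection's energy with the residual's energy). How the paper weights the two terms is not stated in this summary, so the weight `lam` and the choice to compute RMSE on spectrograms and SI-SDR on waveforms are assumptions. NumPy is used for readability; a training implementation would use differentiable tensors, for example in PyTorch.

```python
import numpy as np

def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB for 1-D real signals."""
    # Optimal scaling of the target that best explains the estimate.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = scale * target
    residual = estimate - projection
    return 10.0 * np.log10((np.sum(projection ** 2) + eps)
                           / (np.sum(residual ** 2) + eps))

def rmse(estimate, target):
    """Root mean square error; works for real or complex arrays."""
    return np.sqrt(np.mean(np.abs(estimate - target) ** 2))

def combined_loss(est_spec, ref_spec, est_wave, ref_wave, lam=0.01):
    """RMSE on spectrograms minus a weighted SI-SDR on waveforms.

    lam is a hypothetical weight; higher SI-SDR is better, hence the minus sign.
    """
    return rmse(est_spec, ref_spec) - lam * si_sdr(est_wave, ref_wave)
```

Because SI-SDR ignores the overall scale of the estimate, pairing it with RMSE gives the loss a term that still penalizes level errors.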

Experimental Evaluation

The experiments evaluate CLCNet on noisy speech using SI-SDR and short-time objective intelligibility (STOI). The system improves on traditional real-valued Wiener filter gains, and the gap is largest at low signal-to-noise ratios (SNRs), where complex linear coding is most effective.

Figure 2: Detail view of Mel spectrograms illustrating noise reduction performance.

Discussion

Through objective metrics and detailed spectrogram comparisons, CLCNet exhibits strong performance in noise reduction tasks, maintaining its effectiveness across varying SNRs. The results indicate that leveraging complex-valued transformations, alongside parametric normalization, enables robust speech enhancement suitable for real-time applications in hearing aids.

Conclusion

CLCNet offers a novel approach to noise reduction based on complex linear coding tailored for hearing aids. It addresses real-time constraints and works within the low frequency resolution of hearing aid filter banks, combining a theoretical contribution with practical applicability. Future research may explore further optimization of complex coefficient prediction and applications to other auditory enhancement scenarios.
