
Efficient High-Performance Bark-Scale Neural Network for Residual Echo and Noise Suppression (2404.11621v1)

Published 8 Apr 2024 in eess.AS

Abstract: In recent years, the introduction of neural networks (NNs) into the field of speech enhancement has brought significant improvements. However, many of the proposed methods are demanding in terms of computational complexity and memory footprint. For applications in dedicated communication devices, such as speakerphones, hands-free car systems, or smartphones, efficiency plays a major role along with performance. In this context, we present an efficient, high-performance hybrid system for joint acoustic echo control and noise suppression. Our main contribution is the NN postfilter, which performs both noise suppression and residual echo suppression. A Bark-scale auditory filterbank in the NN postfilter improves the preservation of near-end speech. The proposed hybrid method is benchmarked against state-of-the-art methods, and its effectiveness is demonstrated on the ICASSP 2023 AEC Challenge blind test set. We demonstrate that it offers high-quality near-end speech preservation in both double-talk and near-end single-talk conditions. At the same time, it efficiently removes echo leaks, achieving performance comparable to already small state-of-the-art models such as the end-to-end DeepVQE-S while requiring only around 10 % of its computational complexity. This makes it easy to implement in real time on a speakerphone device.
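To make the idea of a Bark-scale filterbank concrete, the following is a minimal sketch of how linear STFT bins could be grouped into a small number of auditory bands, so that a postfilter estimates one gain per band rather than one per bin. It is not the paper's implementation: the abstract does not specify the filterbank design, so the triangular band shape, the FFT size, the sample rate, the band count, and the function names (`hz_to_bark`, `bark_filterbank`, `to_bands`, `to_bins`) are all assumptions chosen for illustration. The Hz-to-Bark mapping uses the common Zwicker and Terhardt approximation.

```python
import math


def hz_to_bark(f_hz):
    """Zwicker & Terhardt approximation of the Bark scale."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def bark_filterbank(n_fft=512, sample_rate=16000, n_bands=24):
    """Triangular bands spaced uniformly on the Bark axis (illustrative values).

    Returns a list of n_bands rows, each holding one weight per STFT bin.
    Adjacent triangles overlap by half, so the weights of every bin sum to 1.
    """
    n_bins = n_fft // 2 + 1
    bin_bark = [hz_to_bark(k * sample_rate / n_fft) for k in range(n_bins)]
    width = bin_bark[-1] / (n_bands - 1)          # spacing between band centers
    centers = [b * width for b in range(n_bands)]
    return [[max(0.0, 1.0 - abs(z - c) / width) for z in bin_bark] for c in centers]


def to_bands(fb, power_spectrum):
    """Compress a per-bin power spectrum into per-band energies."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in fb]


def to_bins(fb, band_gains):
    """Interpolate per-band gains back onto the STFT bins."""
    n_bins = len(fb[0])
    return [sum(fb[b][k] * band_gains[b] for b in range(len(fb))) for k in range(n_bins)]


fb = bark_filterbank()
band_energy = to_bands(fb, [1.0] * len(fb[0]))    # flat spectrum as a toy input
bin_gains = to_bins(fb, [1.0] * len(fb))          # unity band gains stay unity per bin
```

With these assumed settings, 257 frequency bins are reduced to 24 band features, which illustrates how a band-domain network can be much smaller than one operating on every bin. Low frequencies get narrow bands and high frequencies get wide ones, mirroring the resolution of human hearing.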
