Modified Parametric Multichannel Wiener Filter for Low-latency Enhancement of Speech Mixtures with Unknown Number of Speakers (2306.17317v1)

Published 29 Jun 2023 in eess.AS and cs.SD

Abstract: This paper introduces a novel low-latency online beamforming (BF) algorithm, named Modified Parametric Multichannel Wiener Filter (Mod-PMWF), for enhancing speech mixtures with an unknown and varying number of speakers. Although conventional BFs such as the linearly constrained minimum variance BF (LCMV BF) can enhance a speech mixture, they typically require attributes of the mixture such as the number of speakers and the acoustic transfer functions (ATFs) from the speakers to the microphones. When these attributes are unavailable, estimating them with low-latency processing is challenging, which hinders the application of such BFs to the problem. In this paper, we overcome this problem by modifying a conventional Parametric Multichannel Wiener Filter (PMWF). The proposed Mod-PMWF can adaptively form a directivity pattern that enhances all the speakers in the mixture without explicitly estimating these attributes. Our experiments show the proposed BF's effectiveness in terms of interference reduction ratios and subjective listening tests.
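
For orientation, the conventional PMWF that the abstract takes as its starting point can be sketched as below. This is a minimal illustration of the standard single-bin PMWF as commonly formulated in the literature, not the proposed Mod-PMWF, whose modification is not described in the abstract. The function name `pmwf_weights` and its parameters are hypothetical, and the two spatial covariance matrices are assumed to be given, which is exactly the kind of mixture information that is hard to obtain at low latency.

```python
import numpy as np

def pmwf_weights(phi_s, phi_n, beta=1.0, ref_mic=0):
    """Conventional PMWF weights for one frequency bin (illustrative sketch).

    phi_s   : (M, M) spatial covariance of the desired speech (assumed known)
    phi_n   : (M, M) spatial covariance of noise/interference (assumed known)
    beta    : trade-off parameter; for a rank-1 phi_s, beta=0 gives the MVDR
              beamformer and beta=1 gives the multichannel Wiener filter
    ref_mic : index of the reference microphone
    """
    ratio = np.linalg.solve(phi_n, phi_s)   # phi_n^{-1} phi_s
    lam = np.trace(ratio).real              # tr(phi_n^{-1} phi_s)
    return ratio[:, ref_mic] / (beta + lam) # column selected by the reference mic

# Toy usage: one source with relative transfer function a, white noise.
a = np.array([1.0, 0.5 + 0.5j, -0.3j])
phi_s = 2.0 * np.outer(a, a.conj())
phi_n = np.eye(3)
w = pmwf_weights(phi_s, phi_n, beta=0.0)
gain = np.vdot(w, a)  # w^H a, the response toward the source
```

With `beta=0.0` the response `gain` toward the source is distortionless; larger `beta` trades speech distortion for stronger noise reduction.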

