Multi-Microphone Noise Data Augmentation for DNN-based Own Voice Reconstruction for Hearables in Noisy Environments (2312.08908v1)

Published 14 Dec 2023 in eess.AS

Abstract: Hearables with integrated microphones may offer communication benefits in noisy working environments, e.g. by transmitting the recorded own voice of the user. Systems aiming at reconstructing the clean and full-bandwidth own voice from noisy microphone recordings are often based on supervised learning. Recording a sufficient amount of noise required for training such a system is costly since noise transmission between outer and inner microphones varies individually. Previously proposed methods either do not consider noise, only consider noise at outer microphones, or assume inner and outer microphone noise to be independent during training. It is not yet clear whether individualized noise can benefit the training of an own voice reconstruction system. In this paper, we investigate several noise data augmentation techniques based on measured transfer functions to simulate multi-microphone noise. Using augmented noise, we train a multi-channel own voice reconstruction system. Experiments using real noise are carried out to investigate the generalization capability. Results show that incorporating augmented noise yields large benefits; in particular, considering individualized noise augmentation leads to higher performance.
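
The idea of simulating multi-microphone noise from transfer functions can be illustrated with a minimal sketch. It assumes the transfer functions are available as time-domain impulse responses, one per microphone, and that a single-channel noise recording is filtered with each of them so that the outer and inner microphone noise stay correlated. The function name `simulate_multi_mic_noise` and the toy impulse responses are hypothetical placeholders, not measured data, and the abstract does not specify the exact procedure used in the paper.

```python
import numpy as np


def simulate_multi_mic_noise(noise, impulse_responses):
    """Filter one noise signal with one impulse response per microphone.

    noise: 1-D array of shape (n_samples,)
    impulse_responses: 2-D array of shape (n_mics, ir_len)
    returns: array of shape (n_mics, n_samples)
    """
    noise = np.asarray(noise, dtype=float)
    irs = np.atleast_2d(np.asarray(impulse_responses, dtype=float))
    n = noise.shape[0]
    # Full linear convolution, truncated to the input length.
    return np.stack([np.convolve(noise, h)[:n] for h in irs])


# Toy example (assumed values, not measured transfer functions):
rng = np.random.default_rng(0)
noise = rng.standard_normal(16000)          # 1 s of white noise at 16 kHz
outer_ir = np.array([1.0, 0.0, 0.0, 0.0])   # outer mic: unfiltered
inner_ir = np.array([0.0, 0.0, 0.1, 0.05])  # inner mic: delayed, attenuated
multi = simulate_multi_mic_noise(noise, np.stack([outer_ir, inner_ir]))
```

Because both channels are derived from the same source signal, the simulated inner and outer microphone noise are not independent, which is the property the abstract contrasts with earlier training approaches. Individualized augmentation would correspond to choosing the impulse responses per user.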
