
Meta-AF Echo Cancellation for Improved Keyword Spotting (2312.10605v1)

Published 17 Dec 2023 in cs.SD and eess.AS

Abstract: Adaptive filters (AFs) are vital for enhancing the performance of downstream tasks, such as speech recognition, sound event detection, and keyword spotting. However, traditional AF design prioritizes isolated signal-level objectives, often overlooking downstream task performance. This can lead to suboptimal performance. Recent research has leveraged meta-learning to automatically learn AF update rules from data, alleviating the need for manual tuning when using simple signal-level objectives. This paper improves the Meta-AF framework by expanding it to support end-to-end training for arbitrary downstream tasks. We focus on classification tasks, where we introduce a novel training methodology that harnesses self-supervision and classifier feedback. We evaluate our approach on the combined task of acoustic echo cancellation and keyword spotting. Our findings demonstrate consistent performance improvements with both pre-trained and joint-trained keyword spotting models across synthetic and real playback. Notably, these improvements come without requiring additional tuning, increased inference-time complexity, or reliance on oracle signal-level training data.
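For orientation, here is a minimal sketch of the kind of hand-designed adaptive filter the abstract contrasts with learned update rules. It is a conventional time-domain NLMS echo canceller driven by a signal-level objective, not the Meta-AF method or the paper's training procedure. The function name `nlms_echo_cancel`, its parameters, and the synthetic echo path are all illustrative assumptions.

```python
import numpy as np

def nlms_echo_cancel(far_end, mic, num_taps=32, step=0.5, eps=1e-8):
    """Hypothetical helper: remove the echo of `far_end` from `mic` with NLMS."""
    w = np.zeros(num_taps)        # current estimate of the echo path
    buf = np.zeros(num_taps)      # most recent far-end samples, newest first
    err = np.zeros(len(mic))      # echo-cancelled output
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = far_end[n]
        echo_hat = w @ buf                    # predicted echo sample
        err[n] = mic[n] - echo_hat            # signal-level error
        # Hand-designed normalized LMS update; Meta-AF instead learns
        # the update rule from data (not shown here).
        w += step * err[n] * buf / (buf @ buf + eps)
    return err, w

# Synthetic, noise-free example (assumed setup): white far-end signal
# played through a decaying 32-tap echo path.
rng = np.random.default_rng(0)
far_end = rng.standard_normal(4000)
echo_path = rng.standard_normal(32) * np.exp(-0.2 * np.arange(32))
mic = np.convolve(far_end, echo_path)[:4000]

err, w = nlms_echo_cancel(far_end, mic)
```

In this idealized setting the filter only minimizes residual echo energy. The abstract's point is that such a signal-level objective ignores how well a downstream classifier, such as a keyword spotter, performs on the output.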
