Device-Directed Speech Detection: Regularization via Distillation for Weakly-Supervised Models (2203.15975v1)

Published 30 Mar 2022 in eess.AS, cs.HC, cs.LG, and cs.SD

Abstract: We address the problem of detecting device-directed speech that does not contain a specific wake-word. Specifically, we focus on audio coming from a touch-based invocation. Mitigating virtual assistant (VA) activation due to accidental button presses is critical for user experience. While the majority of approaches to false trigger mitigation (FTM) are designed to detect the presence of a target keyword, inferring user intent in the absence of a keyword is difficult. This also poses a challenge when creating the training and evaluation data for such systems, due to inherent ambiguity in the user's data. To this end, we propose a novel FTM approach that uses weakly-labeled training data obtained with a newly introduced data sampling strategy. While this sampling strategy reduces data annotation effort, the data labels are noisy because the data are not annotated manually. We use these data to train an acoustics-only model for the FTM task by regularizing its loss function via knowledge distillation from an ASR-based (LatticeRNN) model. This improves the model decisions, resulting in a 66% gain in accuracy, as measured by equal-error-rate (EER), over the base acoustics-only model. We also show that the ensemble of the LatticeRNN and acoustic-distilled models brings a further accuracy improvement of 20%.
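To make the idea of "regularizing the loss via knowledge distillation" concrete, here is a minimal sketch of one common way such a loss can be formed for a binary device-directed / not-directed decision. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the binary cross-entropy form, the mixing weight `alpha`, and the names `bce` and `distilled_loss` are hypothetical, and `teacher_prob` simply stands in for the posterior produced by the ASR-based (LatticeRNN) model.

```python
import numpy as np


def bce(pred, target, eps=1e-7):
    """Element-wise binary cross-entropy; `target` may be hard (0/1) or soft."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))


def distilled_loss(student_prob, weak_label, teacher_prob, alpha=0.5):
    """Hypothetical distillation-regularized loss (not the paper's exact form).

    student_prob: acoustics-only model output, P(device-directed)
    weak_label:   noisy 0/1 label from the sampling strategy
    teacher_prob: soft target from the teacher (assumed LatticeRNN posterior)
    alpha:        assumed weight of the distillation term, in [0, 1]
    """
    hard_term = bce(student_prob, weak_label)    # fit the weak labels
    soft_term = bce(student_prob, teacher_prob)  # stay close to the teacher
    return float(np.mean((1.0 - alpha) * hard_term + alpha * soft_term))


if __name__ == "__main__":
    student = np.array([0.9, 0.2])
    weak = np.array([1.0, 1.0])      # second label is assumed to be noisy
    teacher = np.array([0.9, 0.2])   # teacher disagrees with the noisy label
    print(distilled_loss(student, weak, teacher, alpha=0.0))
    print(distilled_loss(student, weak, teacher, alpha=1.0))
```

With `alpha=0.0` the loss reduces to ordinary cross-entropy on the weak labels; raising `alpha` lets the teacher's soft targets temper the influence of a label the teacher disagrees with, which is the intuition behind using distillation as a regularizer for noisy labels.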

Authors (8)
  1. Vineet Garg (11 papers)
  2. Ognjen Rudovic (22 papers)
  3. Pranay Dighe (14 papers)
  4. Ahmed H. Abdelaziz (2 papers)
  5. Erik Marchi (18 papers)
  6. Saurabh Adya (16 papers)
  7. Chandra Dhir (10 papers)
  8. Ahmed Tewfik (25 papers)
Citations (4)
