Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
94 tokens/sec
Gemini 2.5 Pro Premium
55 tokens/sec
GPT-5 Medium
18 tokens/sec
GPT-5 High Premium
24 tokens/sec
GPT-4o
103 tokens/sec
DeepSeek R1 via Azure Premium
93 tokens/sec
GPT OSS 120B via Groq Premium
462 tokens/sec
Kimi K2 via Groq Premium
254 tokens/sec
2000 character limit reached

Stacked 1D convolutional networks for end-to-end small footprint voice trigger detection (2008.03405v1)

Published 8 Aug 2020 in eess.AS and cs.SD

Abstract: We propose a stacked 1D convolutional neural network (S1DCNN) for end-to-end small footprint voice trigger detection in a streaming scenario. Voice trigger detection is an important speech application, with which users can activate their devices by simply saying a keyword or phrase. Due to privacy and latency reasons, a voice trigger detection system should run on an always-on processor on device. Therefore, having small memory and compute cost is crucial for a voice trigger detection system. Recently, singular value decomposition filters (SVDFs) has been used for end-to-end voice trigger detection. The SVDFs approximate a fully-connected layer with a low rank approximation, which reduces the number of model parameters. In this work, we propose S1DCNN as an alternative approach for end-to-end small-footprint voice trigger detection. An S1DCNN layer consists of a 1D convolution layer followed by a depth-wise 1D convolution layer. We show that the SVDF can be expressed as a special case of the S1DCNN layer. Experimental results show that the S1DCNN achieve 19.0% relative false reject ratio (FRR) reduction with a similar model size and a similar time delay compared to the SVDF. By using longer time delays, the S1DCNN further improve the FRR up to 12.2% relative.

Citations (17)

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube