Stacked 1D convolutional networks for end-to-end small footprint voice trigger detection (2008.03405v1)

Published 8 Aug 2020 in eess.AS and cs.SD

Abstract: We propose a stacked 1D convolutional neural network (S1DCNN) for end-to-end small-footprint voice trigger detection in a streaming scenario. Voice trigger detection is an important speech application, with which users can activate their devices by simply saying a keyword or phrase. For privacy and latency reasons, a voice trigger detection system should run on an always-on processor on device, so small memory and compute cost are crucial. Recently, singular value decomposition filters (SVDFs) have been used for end-to-end voice trigger detection. SVDFs approximate a fully-connected layer with a low-rank approximation, which reduces the number of model parameters. In this work, we propose the S1DCNN as an alternative approach for end-to-end small-footprint voice trigger detection. An S1DCNN layer consists of a 1D convolution layer followed by a depth-wise 1D convolution layer. We show that the SVDF can be expressed as a special case of the S1DCNN layer. Experimental results show that the S1DCNN achieves a 19.0% relative false reject ratio (FRR) reduction with a similar model size and a similar time delay compared to the SVDF. By using longer time delays, the S1DCNN further improves the FRR by up to 12.2% relative.
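
The abstract describes an S1DCNN layer as a 1D convolution followed by a depth-wise 1D convolution. The following is a minimal PyTorch sketch of that layer structure, assuming illustrative channel counts, kernel sizes, and activation; it is not the authors' exact configuration from the paper.

```python
import torch
import torch.nn as nn


class S1DCNNLayer(nn.Module):
    """Sketch of one S1DCNN layer: a 1D convolution followed by a
    depth-wise 1D convolution. Hyperparameters here are hypothetical."""

    def __init__(self, in_channels, out_channels,
                 conv_kernel=1, depthwise_kernel=8):
        super().__init__()
        # 1D convolution mixing input channels across a short context
        self.conv = nn.Conv1d(in_channels, out_channels,
                              kernel_size=conv_kernel)
        # depth-wise 1D convolution: one temporal filter per channel
        # (groups == channels), filtering each channel independently
        self.depthwise = nn.Conv1d(out_channels, out_channels,
                                   kernel_size=depthwise_kernel,
                                   groups=out_channels)
        self.act = nn.ReLU()

    def forward(self, x):
        # x: (batch, channels, time)
        return self.act(self.depthwise(self.conv(x)))


if __name__ == "__main__":
    # Illustrative stack of two layers on filterbank-like features
    model = nn.Sequential(S1DCNNLayer(40, 64), S1DCNNLayer(64, 64))
    features = torch.randn(1, 40, 100)  # (batch, feature channels, frames)
    print(model(features).shape)
```

Under this reading, when the first convolution has kernel size 1 (a low-rank channel projection) and the depth-wise convolution spans past frames, the layer reduces to an SVDF-like computation, which is consistent with the abstract's claim that the SVDF is a special case of the S1DCNN layer.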

Authors (4)
  1. Takuya Higuchi (26 papers)
  2. Mohammad Ghasemzadeh (3 papers)
  3. Kisun You (1 paper)
  4. Chandra Dhir (10 papers)
Citations (17)