Optimizing Speech Recognition For The Edge (1909.12408v3)

Published 26 Sep 2019 in cs.CL, cs.LG, and eess.AS

Abstract: While most deployed speech recognition systems today still run on servers, we are in the midst of a transition towards deployments on edge devices. This leap to the edge is powered by the progression from traditional speech recognition pipelines to end-to-end (E2E) neural architectures, and the parallel development of more efficient neural network topologies and optimization techniques. Thus, we are now able to create highly accurate speech recognizers that are both small and fast enough to execute on typical mobile devices. In this paper, we begin with a baseline RNN-Transducer architecture composed of Long Short-Term Memory (LSTM) layers. We then experiment with a variety of more computationally efficient layer types, and apply optimization techniques such as neural connection pruning and parameter quantization, to construct a small, high-quality, on-device speech recognizer that is an order of magnitude smaller than the baseline system without any optimizations.
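
The abstract names connection pruning and parameter quantization without saying which variants the paper uses. As a rough illustration only, the sketch below assumes magnitude-based pruning and symmetric 8-bit linear quantization, applied to a random weight vector. The function names, the 50% sparsity level, and the 8-bit width are all hypothetical choices made for this example, not details taken from the paper.

```python
import random


def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Assumption: magnitude pruning is used purely for illustration.
    """
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    # Indices of the k smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric linear quantization of floats to integers in [-127, 127].

    Assumption: one scale per weight vector, chosen from the largest magnitude.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [v * scale for v in q]


# Hypothetical stand-in for one layer's weights.
random.seed(0)
weights = [random.gauss(0.0, 0.1) for _ in range(1000)]

pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)

nonzero = sum(1 for v in q if v != 0)
max_err = max(abs(a - b) for a, b in zip(pruned, restored))
print(f"nonzero weights after pruning and quantization: {nonzero} of {len(q)}")
print(f"largest round-trip error: {max_err:.6f} (scale = {scale:.6f})")
```

In this toy setting, storing 8-bit integers instead of 32-bit floats cuts the per-weight storage by a factor of four, and a sparse storage format can skip the pruned zeros on top of that. How these savings combine into the order-of-magnitude reduction reported in the abstract depends on the paper's actual layer types and settings, which this sketch does not reproduce.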

Authors (5)
  1. Yuan Shangguan (25 papers)
  2. Jian Li (667 papers)
  3. Qiao Liang (26 papers)
  4. Raziel Alvarez (9 papers)
  5. Ian McGraw (18 papers)
Citations (64)
