
Bunched LPCNet : Vocoder for Low-cost Neural Text-To-Speech Systems (2008.04574v1)

Published 11 Aug 2020 in eess.AS, cs.LG, and cs.SD

Abstract: LPCNet is an efficient vocoder that combines linear prediction and deep neural network modules to keep the computational complexity low. In this work, we present two techniques to further reduce its complexity, aiming for a low-cost LPCNet vocoder-based neural Text-to-Speech (TTS) system. These techniques are: 1) sample-bunching, which allows LPCNet to generate more than one audio sample per inference; and 2) bit-bunching, which reduces the computations in the final layer of LPCNet. With the proposed bunching techniques, LPCNet, in conjunction with a Deep Convolutional TTS (DCTTS) acoustic model, shows a 2.19x improvement over the baseline run-time when running on a mobile device, with a decrease of less than 0.1 in TTS mean opinion score (MOS).
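
As a rough illustration of why the two techniques could lower cost, the sketch below counts network inferences and output-layer units under assumed settings. The function names, the 16,000-sample figure, the 8-bit sample width, and the 4+4 bit split are hypothetical choices made for this example; they are not taken from the paper and may differ from its actual configuration.

```python
from typing import Sequence


def inferences_needed(num_samples: int, bunch_size: int) -> int:
    """Inferences required to emit num_samples when each inference
    yields bunch_size samples (sample-bunching, as described above)."""
    return -(-num_samples // bunch_size)  # ceiling division


def output_units(total_bits: int, split: Sequence[int]) -> int:
    """Output units needed when a total_bits-wide sample is predicted as
    separate groups of bits rather than a single distribution over all
    2**total_bits values (an assumed reading of bit-bunching)."""
    if sum(split) != total_bits:
        raise ValueError("bit groups must add up to total_bits")
    return sum(2 ** bits for bits in split)


# Assumed, illustrative numbers only.
baseline_inferences = inferences_needed(16000, 1)
bunched_inferences = inferences_needed(16000, 2)
baseline_units = output_units(8, [8])
split_units = output_units(8, [4, 4])
```

Under these assumptions, a bunch size of 2 halves the number of inferences, and splitting an 8-bit output into two 4-bit groups shrinks the final layer from 256 to 32 output units. The speed-up and quality trade-off measured on a real device are the ones reported in the abstract, not something this arithmetic predicts.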

Authors (9)
  1. Ravichander Vipperla (6 papers)
  2. Sangjun Park (13 papers)
  3. Kihyun Choo (2 papers)
  4. Samin Ishtiaq (9 papers)
  5. Kyoungbo Min (1 paper)
  6. Sourav Bhattacharya (75 papers)
  7. Abhinav Mehrotra (16 papers)
  8. Alberto Gil C. P. Ramos (7 papers)
  9. Nicholas D. Lane (97 papers)
Citations (26)
