Fast and Accurate Capitalization and Punctuation for Automatic Speech Recognition Using Transformer and Chunk Merging (1908.02404v1)

Published 7 Aug 2019 in cs.CL

Abstract: In recent years, studies on automatic speech recognition (ASR) have shown outstanding results that reach human parity on short speech segments. However, there are still difficulties in standardizing ASR output, such as restoring capitalization and punctuation, for long-speech transcription. These problems make it harder for readers to understand the ASR output semantically and also cause difficulties for natural language processing models such as NER, POS tagging, and semantic parsing. In this paper, we propose a method to restore punctuation and capitalization for long-speech ASR transcription. The method is based on Transformer models and chunk merging, which allow us to (1) build a single model that performs punctuation and capitalization in one go, and (2) perform decoding in parallel while improving prediction accuracy. Experiments on the British National Corpus showed that the proposed approach outperforms existing methods in both accuracy and decoding speed.
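The abstract does not spell out the merging rule, so the following is only a rough sketch of the general idea of overlapping chunks decoded independently and then merged. It assumes (this is not taken from the paper) that each token's final label comes from the chunk in which the token sits farthest from a chunk edge, where context is richest. The names `split_into_chunks`, `merge_chunk_labels`, `tag_long_sequence`, and the `tagger` callable are hypothetical; `tagger` stands in for a single model that returns one joint capitalization-and-punctuation label per token.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def split_into_chunks(tokens: Sequence[str], chunk_size: int,
                      stride: int) -> List[Tuple[int, List[str]]]:
    """Return (start_index, chunk) pairs of overlapping windows over tokens."""
    if not 0 < stride <= chunk_size:
        raise ValueError("stride must satisfy 0 < stride <= chunk_size")
    chunks = []
    start = 0
    while True:
        chunks.append((start, list(tokens[start:start + chunk_size])))
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the sequence
        start += stride
    return chunks


def merge_chunk_labels(n_tokens: int,
                       chunk_labels: List[Tuple[int, list]]) -> list:
    """Merge per-chunk labels into one label per token.

    Assumed rule (illustrative only): for each position, keep the label from
    the chunk where the position is farthest from either chunk edge.
    Ties go to the earlier chunk.
    """
    merged = [None] * n_tokens
    best_margin = [-1] * n_tokens
    for start, labels in chunk_labels:
        last = len(labels) - 1
        for offset, label in enumerate(labels):
            margin = min(offset, last - offset)  # distance to nearest edge
            pos = start + offset
            if margin > best_margin[pos]:
                merged[pos] = label
                best_margin[pos] = margin
    return merged


def tag_long_sequence(tokens: Sequence[str],
                      tagger: Callable[[List[str]], list],
                      chunk_size: int = 4, stride: int = 2) -> list:
    """Tag chunks independently (here via a thread pool), then merge."""
    chunks = split_into_chunks(tokens, chunk_size, stride)
    with ThreadPoolExecutor() as pool:
        labels = list(pool.map(lambda item: tagger(item[1]), chunks))
    chunk_labels = [(start, lab) for (start, _), lab in zip(chunks, labels)]
    return merge_chunk_labels(len(tokens), chunk_labels)
```

Because each chunk is tagged without reference to the others, the per-chunk calls can run concurrently; the overlap exists only so that tokens near a chunk boundary can take their label from a neighbouring chunk that sees more context around them.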

Authors (7)
  1. Binh Nguyen (21 papers)
  2. Vu Bao Hung Nguyen (1 paper)
  3. Hien Nguyen (33 papers)
  4. Pham Ngoc Phuong (2 papers)
  5. The-Loc Nguyen (1 paper)
  6. Quoc Truong Do (5 papers)
  7. Luong Chi Mai (2 papers)
Citations (43)
