
Can Generative Large Language Models Perform ASR Error Correction? (2307.04172v2)

Published 9 Jul 2023 in cs.CL, cs.SD, and eess.AS

Abstract: ASR error correction is an interesting option for post-processing speech recognition system outputs. These error correction models are usually trained in a supervised fashion on the decoding results of a target ASR system. This approach can be computationally intensive, and the resulting model is tuned to a specific ASR system. Recently, generative LLMs have been applied to a wide range of natural language processing tasks, as they can operate in a zero-shot or few-shot fashion. In this paper we investigate using ChatGPT, a generative LLM, for ASR error correction. Based on the ASR N-best output, we propose two approaches: an unconstrained one, and a constrained one in which a member of the N-best list is selected. Additionally, zero-shot and 1-shot settings are evaluated. Experiments show that this generative LLM approach can yield performance gains for two different state-of-the-art ASR architectures, transducer and attention-encoder-decoder based, and on multiple test sets.
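
The difference between the unconstrained and constrained settings mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `build_prompt` and `resolve_constrained`, the prompt wording, the `<optionN>` tags, and the fallback to the top hypothesis are hypothetical and are not taken from the paper. The sketch only formats an N-best list and interprets a reply; it does not call any LLM.

```python
from typing import List


def build_prompt(nbest: List[str], constrained: bool) -> str:
    """Format an N-best list as a correction prompt (hypothetical wording)."""
    options = [
        f"<option{i}> {hyp} </option{i}>"
        for i, hyp in enumerate(nbest, start=1)
    ]
    if constrained:
        # Constrained: the model must pick one member of the N-best list.
        task = (
            "Select the most likely transcription from the options below. "
            "Reply with the option number only."
        )
    else:
        # Unconstrained: the model may write any corrected sentence.
        task = (
            "Using the options below as evidence, write the most likely "
            "correct transcription."
        )
    return task + "\n" + "\n".join(options)


def resolve_constrained(reply: str, nbest: List[str]) -> str:
    """Map a model reply to an N-best member.

    Assumption: if the reply holds no valid option number, fall back to
    the first (1-best) hypothesis.
    """
    digits = "".join(ch for ch in reply if ch.isdigit())
    if digits:
        index = int(digits) - 1
        if 0 <= index < len(nbest):
            return nbest[index]
    return nbest[0]
```

In the constrained case the final output is always one of the ASR system's own hypotheses, so the model cannot introduce words that the recognizer never proposed; the unconstrained case gives up that guarantee in exchange for the freedom to produce a sentence outside the list.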

Authors (5)
  1. Rao Ma (22 papers)
  2. Mengjie Qian (20 papers)
  3. Potsawee Manakul (24 papers)
  4. Mark Gales (52 papers)
  5. Kate Knill (11 papers)
Citations (35)