Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting (2309.15649v2)
Abstract: We explore the ability of LLMs to act as speech recognition post-processors that perform rescoring and error correction. Our first focus is on instruction prompting to let LLMs perform these tasks without fine-tuning, for which we evaluate different prompting schemes, both zero- and few-shot in-context learning, and a novel task-activating prompting method that combines causal instructions and demonstrations to increase its context windows. Next, we show that rescoring only by in-context learning with frozen LLMs achieves results that are competitive with rescoring by domain-tuned LMs, using a pretrained first-pass recognition system and rescoring output on two out-of-domain tasks (ATIS and WSJ). By combining prompting techniques with fine-tuning we achieve error rates below the N-best oracle level, showcasing the generalization power of the LLMs.
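The abstract describes the prompting setup only at a high level. As a purely illustrative sketch (not the paper's implementation), the Python snippet below shows one plausible way to assemble a zero- or few-shot error-correction prompt over an ASR N-best list for a frozen LLM; the `llm_generate` callable, the prompt wording, and the demonstration format are all hypothetical placeholders.

```python
# Minimal sketch of few-shot prompting for ASR N-best error correction.
# `llm_generate` is a hypothetical stand-in for any frozen-LLM text-completion call;
# no fine-tuning of the LLM is assumed.

from typing import Callable, List, Sequence, Tuple


def build_correction_prompt(
    demos: Sequence[Tuple[List[str], str]],  # (N-best hypotheses, reference transcript) pairs
    nbest: List[str],                        # hypotheses for the utterance to be corrected
) -> str:
    """Compose instruction + in-context demonstrations + query into one prompt string."""
    lines = [
        "You are an ASR post-processor. Given the N-best hypotheses of a speech",
        "recognizer, output the most likely correct transcript.",
        "",
    ]
    # Few-shot demonstrations; pass demos=[] for the zero-shot setting.
    for hyps, ref in demos:
        lines += [f"hypothesis {i + 1}: {h}" for i, h in enumerate(hyps)]
        lines += [f"corrected transcript: {ref}", ""]
    # Query utterance whose transcript the LLM should produce.
    lines += [f"hypothesis {i + 1}: {h}" for i, h in enumerate(nbest)]
    lines.append("corrected transcript:")
    return "\n".join(lines)


def correct_utterance(
    llm_generate: Callable[[str], str],
    demos: Sequence[Tuple[List[str], str]],
    nbest: List[str],
) -> str:
    """Run one utterance through the frozen LLM and return its corrected transcript."""
    prompt = build_correction_prompt(demos, nbest)
    return llm_generate(prompt).strip()
```

In the task-activating variant, the fixed instruction block above would be replaced by a short preparatory exchange that elicits the model's knowledge of the task before the demonstrations and query are appended; the exact dialogue is described in the paper and is not reproduced here.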
Authors: Chao-Han Huck Yang, Yile Gu, Yi-Chieh Liu, Shalini Ghosh, Ivan Bulyko, Andreas Stolcke