Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models (2407.01909v1)

Published 2 Jul 2024 in cs.CL, cs.SD, and eess.AS

Abstract: Recent studies have demonstrated the efficacy of LLMs in error correction for automatic speech recognition (ASR). However, much of the research focuses on the English language. This paper shifts the focus to Chinese. Firstly, we construct a specialized benchmark dataset aimed at error correction for Chinese ASR with 724K hypothesis-transcription pairs, named the Chinese Hypotheses Paradise dataset (ChineseHP), which covers a wide range of scenarios and presents significant challenges. Subsequently, we conduct a preliminary evaluation on the dataset of both direct prompting and fine-tuning of pre-trained LLMs. Furthermore, we propose a straightforward method of Pinyin regularization for prompts, which derives the Pinyin transcription directly from the text hypotheses. The experimental results reveal that Pinyin regularization consistently enhances the error-correcting ability of LLMs compared with prompts without regularization. The dataset is available on the website.
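As described in the abstract, Pinyin regularization amounts to prompting the LLM with the ASR hypothesis together with its Pinyin rendering, giving the model explicit phonetic evidence for repairing homophone or near-homophone substitution errors. The sketch below illustrates how such a prompt might be assembled; it assumes the pypinyin package and an illustrative prompt template, neither of which is specified by the paper, so it is a rough approximation rather than the authors' implementation.

```python
# Minimal sketch of Pinyin-regularized prompting for Chinese ASR error correction.
# Assumption: the pypinyin package is used for Pinyin conversion and the prompt
# wording is illustrative; the paper does not prescribe either.
from pypinyin import lazy_pinyin, Style


def build_prompt(hypothesis: str) -> str:
    """Augment an ASR hypothesis with its Pinyin transcription before
    asking an LLM to produce a corrected transcription."""
    # Convert each character to tone-numbered Pinyin, e.g. "语音" -> "yu3 yin1".
    pinyin = " ".join(lazy_pinyin(hypothesis, style=Style.TONE3))
    return (
        "Correct the following Chinese ASR hypothesis.\n"
        f"Hypothesis: {hypothesis}\n"
        f"Pinyin: {pinyin}\n"
        "Corrected transcription:"
    )


if __name__ == "__main__":
    # Example: the prompt carries both the characters and their pronunciation.
    print(build_prompt("今天天气怎么样"))
```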

Authors (4)
  1. Zhiyuan Tang (34 papers)
  2. Dong Wang (628 papers)
  3. Shen Huang (25 papers)
  4. Shidong Shang (10 papers)