A Chat About Boring Problems: Studying GPT-based Text Normalization (2309.13426v2)
Abstract: Text normalization, the conversion of text from written to spoken form, is traditionally assumed to be an ill-formed task for language models. In this work, we argue otherwise. We empirically show the capacity of Large Language Models (LLMs) for text normalization in few-shot scenarios. Combining self-consistency reasoning with linguistically informed prompt engineering, we find LLM-based text normalization to achieve error rates approximately 40% lower than those of top normalization systems. Further, upon error analysis, we note key limitations in the conventional design of text normalization tasks. We create a new taxonomy of text normalization errors and apply it to results from GPT-3.5-Turbo and GPT-4.0. Through this new framework, we can identify the strengths and weaknesses of GPT-based text normalization (TN), opening opportunities for future work.
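For readers unfamiliar with the recipe the abstract names, the sketch below illustrates how few-shot prompting and self-consistency (majority voting over independently sampled outputs) can combine for text normalization. It is a minimal illustration under stated assumptions: the prompt wording, the example pairs, and the `sample_completion` callable (any stochastic LLM call) are placeholders, not the paper's actual prompts or models.

```python
from collections import Counter
from typing import Callable

# Minimal sketch of few-shot text normalization with self-consistency.
# The prompt text and examples below are illustrative assumptions only;
# they are not the prompts used in the paper.

FEW_SHOT_PROMPT = """Convert the written form of the sentence to its spoken form.

Written: Dr. Smith was born on Jan. 5, 1990.
Spoken: doctor Smith was born on january fifth nineteen ninety

Written: The bill came to $12.50.
Spoken: the bill came to twelve dollars and fifty cents

Written: {sentence}
Spoken:"""


def normalize_with_self_consistency(
    sentence: str,
    sample_completion: Callable[[str], str],  # one stochastic LLM call (temperature > 0)
    num_samples: int = 5,
) -> str:
    """Sample several candidate normalizations and return the most frequent one."""
    prompt = FEW_SHOT_PROMPT.format(sentence=sentence)
    candidates = [sample_completion(prompt).strip() for _ in range(num_samples)]
    # Self-consistency: independently sampled outputs vote; ties break by
    # first-seen order, which Counter.most_common preserves.
    return Counter(candidates).most_common(1)[0][0]
```

The majority vote is what makes the decoding "self-consistent": a spurious one-off normalization is filtered out as long as the model agrees with itself across samples more often than not.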
- Yang Zhang
- Travis M. Bartley
- Mariana Graterol-Fuenmayor
- Vitaly Lavrukhin
- Evelina Bakhturina
- Boris Ginsburg