On the (In)Effectiveness of Large Language Models for Chinese Text Correction (2307.09007v2)
Abstract: The rapid progress of Large Language Models (LLMs) has impressed the entire Artificial Intelligence community. Owing to their emergent abilities, LLMs have drawn increasing research attention to their capabilities and performance on various downstream NLP tasks. Beyond their strong performance across many tasks, LLMs also exhibit notable multilingual ability, including in Chinese. To probe the Chinese processing ability of LLMs, we focus on Chinese Text Correction, a fundamental and challenging Chinese NLP task. Specifically, we evaluate various representative LLMs on Chinese Grammatical Error Correction (CGEC) and Chinese Spelling Check (CSC), the two main Chinese Text Correction scenarios. We additionally fine-tune LLMs for Chinese Text Correction to better observe their potential. Through extensive analyses and comparisons with previous state-of-the-art small models, we empirically find that current LLMs show both impressive performance and unsatisfactory behavior on Chinese Text Correction. We believe our findings will facilitate the deployment and application of LLMs in the Chinese NLP community.
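As a concrete illustration of the zero-shot evaluation setup the abstract describes, below is a minimal sketch of prompting an LLM for CSC and scoring it by sentence-level exact match. This is a sketch under stated assumptions: the prompt wording, the `evaluate_csc` helper, and the callable text-in/text-out LLM interface are illustrative, not the paper's actual protocol or prompts.

```python
from typing import Callable, List, Tuple

# Hypothetical zero-shot prompt for Chinese Spelling Check (CSC);
# the paper's exact prompt wording is not reproduced here.
CSC_PROMPT = (
    "请纠正下面句子中的拼写错误（错别字），"
    "只输出改正后的句子，不要输出其他内容。\n句子：{src}"
)

def evaluate_csc(llm: Callable[[str], str],
                 pairs: List[Tuple[str, str]]) -> float:
    """Sentence-level correction accuracy: the model output must
    exactly match the gold corrected sentence."""
    correct = 0
    for src, gold in pairs:
        pred = llm(CSC_PROMPT.format(src=src)).strip()
        correct += int(pred == gold)
    return correct / len(pairs) if pairs else 0.0

if __name__ == "__main__":
    # Toy example with a stub "LLM" that just echoes the input sentence;
    # in practice `llm` would wrap a real chat-completion API call.
    toy_pairs = [("我今天很高心。", "我今天很高兴。")]
    echo_llm = lambda prompt: prompt.rsplit("：", 1)[-1]
    print(f"accuracy = {evaluate_csc(echo_llm, toy_pairs):.2f}")
```

Note that CSC work commonly also reports character-level detection and correction precision/recall/F1; exact sentence match is used here only to keep the sketch short.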
Authors: Yinghui Li, Haojing Huang, Shirong Ma, Yong Jiang, Yangning Li, Feng Zhou, Hai-Tao Zheng, Qingyu Zhou