
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking (2403.08492v3)

Published 13 Mar 2024 in cs.CL

Abstract: Chinese Spell Checking (CSC) is a widely used technology that plays a vital role in speech-to-text (STT) and optical character recognition (OCR). Most existing CSC approaches rely on the BERT architecture and achieve excellent performance. However, limited by the scale of the foundation model, BERT-based methods do not work well in few-shot scenarios, which limits their practical applicability. In this paper, we explore an in-context learning method named RS-LLM (Rich Semantic based LLMs) that introduces LLMs as the foundation model. We also study the impact of incorporating various kinds of Chinese rich semantic information into our framework. We find that by introducing a small number of specific Chinese rich semantic structures, LLMs outperform BERT-based models on the few-shot CSC task. Furthermore, we conduct experiments on multiple datasets, and the results verify the superiority of our proposed framework.
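
The abstract describes the framework only at a high level, so the sketch below is a minimal, hypothetical illustration of what a few-shot in-context learning prompt with rich semantic cues (pinyin and a short gloss per demonstration) might look like for CSC. The demonstration sentences, annotation fields, and prompt wording are assumptions for illustration, not the paper's actual template.

```python
# Minimal sketch of a few-shot CSC prompt that injects "rich semantic"
# annotations (here: pinyin and a one-line gloss per demonstration).
# Demonstrations and wording are illustrative placeholders only.

FEW_SHOT_DEMOS = [
    {
        "input": "我今天很高心。",                      # sentence with a spelling error
        "pinyin": "wo3 jin1 tian1 hen3 gao1 xin1",
        "gloss": "'心' (xin1) is phonologically similar to the intended '兴' (xing4).",
        "output": "我今天很高兴。",                     # corrected sentence
    },
    {
        "input": "他明天要去北京开会意。",
        "pinyin": "ta1 ming2 tian1 yao4 qu4 bei3 jing1 kai1 hui4 yi4",
        "gloss": "'会意' should be '会议' (meeting); '意' and '议' are both pronounced yi4.",
        "output": "他明天要去北京开会议。",
    },
]

def build_prompt(query_sentence: str) -> str:
    """Assemble an in-context learning prompt for Chinese Spell Checking.

    Each demonstration pairs the erroneous sentence with rich semantic
    cues (pinyin, a short note) before showing the correction, so the
    LLM can ground its edit in phonological and semantic evidence.
    """
    parts = ["Correct the spelling errors in the Chinese sentence. "
             "Use the pinyin and notes as evidence.\n"]
    for demo in FEW_SHOT_DEMOS:
        parts.append(f"Sentence: {demo['input']}")
        parts.append(f"Pinyin: {demo['pinyin']}")
        parts.append(f"Note: {demo['gloss']}")
        parts.append(f"Correction: {demo['output']}\n")
    parts.append(f"Sentence: {query_sentence}")
    parts.append("Correction:")
    return "\n".join(parts)

if __name__ == "__main__":
    # The returned string would be sent to any chat/completions endpoint.
    print(build_prompt("天气很好，我们去公圆散步吧。"))
```

In this style of framework, swapping the annotation fields (e.g., adding glyph or radical information) changes only the demonstration dictionaries, which makes it straightforward to compare different kinds of rich semantic input.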

Authors (5)
  1. Ming Dong (38 papers)
  2. Yujing Chen (6 papers)
  3. Miao Zhang (147 papers)
  4. Hao Sun (383 papers)
  5. Tingting He (6 papers)
Citations (2)
