
Make BERT-based Chinese Spelling Check Model Enhanced by Layerwise Attention and Gaussian Mixture Model (2312.16623v1)

Published 27 Dec 2023 in cs.CL

Abstract: BERT-based models have recently shown a remarkable ability on the Chinese Spelling Check (CSC) task. However, traditional BERT-based methods still suffer from two limitations. First, although previous works have identified that explicit prior knowledge such as Part-Of-Speech (POS) tagging can benefit the CSC task, they neglected the fact that spelling errors inherent in CSC data can lead to incorrect tags and therefore mislead models. Additionally, they ignored the correlation between the implicit hierarchical information encoded by BERT's intermediate layers and different linguistic phenomena. This results in sub-optimal accuracy. To alleviate these two issues, we design a heterogeneous knowledge-infused framework to strengthen BERT-based CSC models. To incorporate explicit POS knowledge, we utilize an auxiliary task strategy driven by a Gaussian mixture model. Meanwhile, to incorporate implicit hierarchical linguistic knowledge within the encoder, we propose a novel form of n-gram-based layerwise self-attention to generate a multilayer representation. Experimental results show that our proposed framework yields a stable performance boost over four strong baseline models and outperforms the previous state-of-the-art methods on two datasets.
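To make the layerwise idea concrete, below is a minimal sketch of fusing BERT's intermediate layers with a learned per-token attention over layers, in the spirit of the multilayer representation the abstract describes. This is not the authors' released implementation: the class name LayerwiseAttentionFusion, the single-vector layer scorer, and the token-level correction head are illustrative assumptions, and the paper's actual n-gram-based layerwise self-attention and GMM-driven auxiliary task are not reproduced here.

```python
# Hypothetical sketch (not the paper's code): weight each BERT encoder layer
# per token and sum them into one fused representation for CSC correction.
import torch
import torch.nn as nn
from transformers import BertModel

class LayerwiseAttentionFusion(nn.Module):
    """Fuses all transformer layers of BERT with a softmax attention over layers."""

    def __init__(self, model_name: str = "bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name, output_hidden_states=True)
        hidden = self.bert.config.hidden_size            # 768 for BERT-base
        # One scoring projection shared across layers; softmax over the layer axis
        # yields per-token weights for each of the 12 encoder layers.
        self.layer_scorer = nn.Linear(hidden, 1)
        # Token-level correction head over the vocabulary (assumed output format).
        self.classifier = nn.Linear(hidden, self.bert.config.vocab_size)

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # hidden_states is a tuple of (num_layers + 1) tensors of shape
        # (batch, seq, hidden); drop the embedding output and stack the rest.
        layers = torch.stack(outputs.hidden_states[1:], dim=2)   # (batch, seq, L, hidden)
        scores = self.layer_scorer(layers).squeeze(-1)           # (batch, seq, L)
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)    # (batch, seq, L, 1)
        fused = (weights * layers).sum(dim=2)                    # (batch, seq, hidden)
        return self.classifier(fused)                            # per-token logits
```

The design choice illustrated here is that different layers specialize in different linguistic phenomena, so letting each token choose its own mixture of layers can recover information a last-layer-only model discards; the paper's variant additionally conditions this attention on n-gram context.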

