
Exploring the Capabilities of ChatGPT in Ancient Chinese Translation and Person Name Recognition (2312.15304v2)

Published 23 Dec 2023 in cs.CL and cs.AI

Abstract: ChatGPT's proficiency in handling modern standard languages suggests potential for its use in understanding ancient Chinese. This paper explores ChatGPT's capabilities on ancient Chinese via two tasks: translating ancient Chinese into modern Chinese and recognizing ancient Chinese person names. ChatGPT's output is compared with human translations to evaluate its comprehension of ancient Chinese. The findings indicate that: (1) ChatGPT's proficiency in ancient Chinese has yet to reach a satisfactory level; (2) ChatGPT performs best on ancient-to-modern translation when fed three context sentences. To help reproduce our work, we display the Python code snippets used in this study.
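To illustrate what feeding context sentences alongside a translation request could look like, here is a minimal Python sketch. It is not the authors' code. The function name `build_translation_prompt`, the prompt wording, and the choice to take context from the sentences immediately preceding the target are all assumptions made for illustration; the abstract says only that three context sentences worked best.

```python
def build_translation_prompt(sentences, index, n_context=3):
    """Return a prompt asking for a modern Chinese translation of sentences[index].

    Hypothetical helper: up to `n_context` preceding sentences are included as
    context. Using *preceding* sentences is an assumption, not something the
    abstract specifies.
    """
    if not 0 <= index < len(sentences):
        raise IndexError("index out of range")

    # Take at most n_context sentences that come right before the target.
    start = max(0, index - n_context)
    context = sentences[start:index]

    lines = []
    if context:
        lines.append("Context (preceding ancient Chinese sentences):")
        lines.extend(context)
    lines.append("Translate the following ancient Chinese sentence into modern Chinese:")
    lines.append(sentences[index])
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder passage; any list of ancient Chinese sentences would do.
    passage = [
        "学而时习之,不亦说乎?",
        "有朋自远方来,不亦乐乎?",
        "人不知而不愠,不亦君子乎?",
    ]
    print(build_translation_prompt(passage, index=2))
```

The resulting string would then be sent to a chat model by whatever client the reader uses; that step is left out here because the abstract does not describe the API calls involved.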

Authors (5)
  1. Siqing Zhou
  2. Shijing Si
  3. Le Tang
  4. Xiaoqing Cheng
  5. Yugui Zhang