
Exploring Human-Like Translation Strategy with Large Language Models (2305.04118v3)

Published 6 May 2023 in cs.CL

Abstract: LLMs have demonstrated impressive capabilities in general scenarios, exhibiting a level of aptitude that approaches, in some aspects even surpasses, human-level intelligence. Among their numerous skills, the translation abilities of LLMs have received considerable attention. Compared to typical machine translation that focuses solely on source-to-target mapping, LLM-based translation can potentially mimic the human translation process which might take preparatory steps to ensure high-quality translation. This work explores this possibility by proposing the MAPS framework, which stands for Multi-Aspect Prompting and Selection. Specifically, we enable LLMs first to analyze the given source sentence and induce three aspects of translation-related knowledge: keywords, topics, and relevant demonstrations to guide the final translation process. Moreover, we employ a selection mechanism based on quality estimation to filter out noisy and unhelpful knowledge. Both automatic (3 LLMs x 11 directions x 2 automatic metrics) and human evaluation (preference study and MQM) demonstrate the effectiveness of MAPS. Further analysis shows that by mimicking the human translation process, MAPS reduces various translation errors such as hallucination, ambiguity, mistranslation, awkward style, untranslated text, and omission. Source code is available at https://github.com/zwhe99/MAPS-mt.

Exploration of Human-Like Translation Strategies in LLMs

The paper under discussion, "Exploring Human-Like Translation Strategy with Large Language Models," has been accepted for publication in TACL. It investigates whether LLMs can be prompted to apply translation strategies that mimic the cognitive process of human translators.

Overview

In recent years, LLMs have demonstrated exceptional capabilities across natural language processing tasks, including translation. However, a gap remains between machine-generated translations and the way humans translate. This research attempts to narrow that gap with MAPS (Multi-Aspect Prompting and Selection), which integrates human-like preparatory steps into LLM-driven translation. The paper investigates whether LLMs can not only map source text to target text but also replicate the decision-making process human translators typically follow before producing a final rendering.

Methodology

MAPS is a prompting framework and requires no additional model training. Given a source sentence, the LLM is first prompted to induce three aspects of translation-related knowledge: keywords, topics, and relevant demonstrations. Each kind of knowledge then guides a separate candidate translation, and a selection mechanism based on quality estimation filters out noisy or unhelpful knowledge by choosing the best-scoring candidate as the final output. In this way the method mirrors the preparatory steps a human translator takes to ensure fluency and contextual awareness.
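To make the pipeline concrete, the sketch below walks through the three MAPS stages in Python. It is an illustrative approximation rather than the authors' released implementation (available at the repository linked in the abstract): the prompt wordings are paraphrased from the paper's description, an OpenAI chat model stands in for the LLM, and COMETKiwi stands in for the quality-estimation scorer.

```python
"""Minimal sketch of the MAPS idea (Multi-Aspect Prompting and Selection).

Assumptions (not the authors' exact setup): an OpenAI chat model serves as the
LLM, prompts are paraphrased from the paper's description, and COMETKiwi serves
as the reference-free quality-estimation (QE) scorer. Requires the `openai` and
`unbabel-comet` packages, an OPENAI_API_KEY, and Hugging Face access to the
(gated) COMETKiwi checkpoint.
"""
from openai import OpenAI
from comet import download_model, load_from_checkpoint

client = OpenAI()
qe_model = load_from_checkpoint(download_model("Unbabel/wmt22-cometkiwi-da"))


def llm(prompt: str) -> str:
    """Single-turn completion from a chat LLM (model choice is illustrative)."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()


def qe_score(src: str, hyp: str) -> float:
    """Reference-free quality estimate of a candidate translation."""
    return qe_model.predict([{"src": src, "mt": hyp}], batch_size=1, gpus=0).scores[0]


def maps_translate(src: str, src_lang: str = "English", tgt_lang: str = "German") -> str:
    # 1) Induce three aspects of translation-related knowledge from the source.
    keywords = llm(f"Extract the keywords of the {src_lang} sentence below and give their "
                   f"{tgt_lang} translations.\nSentence: {src}")
    topics = llm(f"Describe the topic of the {src_lang} sentence below in a few words.\n"
                 f"Sentence: {src}")
    demo = llm(f"Write one {src_lang} sentence related to, but different from, the sentence "
               f"below, together with its {tgt_lang} translation.\nSentence: {src}")

    # 2) Generate candidate translations, each guided by one kind of knowledge,
    #    plus a plain baseline with no extra context.
    contexts = {"baseline": "", "keywords": f"Keyword pairs: {keywords}\n",
                "topics": f"Topic: {topics}\n", "demo": f"Related example: {demo}\n"}
    candidates = [
        llm(f"{ctx}Translate the following {src_lang} sentence into {tgt_lang}.\n"
            f"Sentence: {src}")
        for ctx in contexts.values()
    ]

    # 3) Selection: keep the candidate the QE model scores highest, which filters
    #    out knowledge that turned out to be noisy or unhelpful.
    return max(candidates, key=lambda hyp: qe_score(src, hyp))


if __name__ == "__main__":
    print(maps_translate("The box is in the pen."))
```

Because any single kind of induced knowledge can help on one sentence and mislead on another, the final selection step is what keeps the extra prompting from degrading cases where the plain baseline translation is already the best candidate.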

Results

The empirical results substantiate the claim that incorporating human-like strategies improves translation quality. Automatic evaluation spans three LLMs and eleven translation directions and shows consistent gains on two neural metrics, COMET and BLEURT. Human evaluation comprises a preference study and an MQM error analysis, in which annotators favor MAPS outputs and find fewer errors such as hallucination, ambiguity, mistranslation, awkward style, untranslated text, and omission. This dual evaluation approach provides a comprehensive measure of the method's performance.
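For the automatic side of such an evaluation, the snippet below shows how a system output can be scored with the reference-based COMET-22 model via the unbabel-comet package; the sentences are invented examples, and the call pattern follows that package's documented usage rather than the paper's exact evaluation scripts.

```python
# Reference-based scoring with COMET-22 (Unbabel/wmt22-comet-da) using `unbabel-comet`.
# The sentences are invented examples, not data from the paper.
from comet import download_model, load_from_checkpoint

model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))

data = [{
    "src": "Der Arzt hat morgen keine Termine mehr frei.",             # source
    "mt":  "The doctor has no more appointments available tomorrow.",  # system output
    "ref": "The doctor has no appointments left tomorrow.",            # human reference
}]

out = model.predict(data, batch_size=8, gpus=0)
print(out.scores)        # per-segment scores
print(out.system_score)  # corpus-level average
```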

Implications and Future Work

The implications of this research are considerable, particularly in enhancing machine translation systems' adaptability to complex linguistic scenarios, thus broadening their applicability in real-world settings. On a theoretical level, the paper contributes to the ongoing discourse on the emulation of human cognitive strategies within AI frameworks, suggesting that such integrations can lead to more sophisticated and nuanced LLMs.

Looking forward, the exploration of more intricate cognitive processes, such as cultural understanding and emotional nuance, presents an intriguing avenue for research. Further interdisciplinary collaboration between computational linguistics and cognitive psychology could yield further advances in this domain. The adaptability of LLMs to such diverse cognitive skills underscores their potential to transform nuanced language tasks beyond the conventional scope of syntactic and semantic translation.

In summary, this paper contributes substantially to the body of knowledge on LLMs while providing practical enhancements to translation technology. The promising results and methodological innovations pave the way for future research that could continue to blur the lines between human and machine cognitive capabilities in language understanding and generation.

Authors (9)
  1. Zhiwei He (42 papers)
  2. Tian Liang (50 papers)
  3. Wenxiang Jiao (44 papers)
  4. Zhuosheng Zhang (125 papers)
  5. Yujiu Yang (155 papers)
  6. Rui Wang (996 papers)
  7. Zhaopeng Tu (135 papers)
  8. Shuming Shi (126 papers)
  9. Xing Wang (191 papers)
Citations (26)