General2Specialized LLMs Translation for E-commerce (2403.03689v2)

Published 6 Mar 2024 in cs.CL and cs.AI

Abstract: Existing Neural Machine Translation (NMT) models mainly handle translation in the general domain, while overlooking domains with special writing conventions, such as e-commerce and legal documents. Taking e-commerce as an example, the texts usually contain large numbers of domain-specific terms and exhibit more grammatical irregularities, which leads to inferior performance from current NMT methods. To address these problems, we collect two domain-related resources: a set of term pairs (aligned Chinese-English bilingual terms) and a parallel corpus annotated for the e-commerce domain. Furthermore, we propose a two-step fine-tuning paradigm (named G2ST) with self-contrastive semantic enhancement to transfer a general NMT model to a specialized NMT model for e-commerce. The paradigm can also be applied to NMT models based on LLMs. Extensive evaluations on real e-commerce titles demonstrate the superior translation quality and robustness of our G2ST approach compared with state-of-the-art NMT models such as LLaMA, Qwen, GPT-3.5, and even GPT-4.
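The abstract describes the approach only at a high level. As an illustration, the sketch below shows what the two-step fine-tuning recipe with a self-contrastive consistency term might look like; it assumes an R-Drop-style bidirectional KL regularizer (reference 16, which the paper cites) and a generic seq2seq `model(src, tgt_in) -> logits` interface. The function names, the `alpha` weight, and applying the consistency term in both stages are assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def self_contrastive_loss(model, src, tgt_in, tgt_out, pad_id, alpha=1.0):
    """Cross-entropy plus a symmetric KL term between two stochastic
    forward passes (dropout left on), in the spirit of R-Drop [16].
    This is an assumed stand-in for the paper's 'self-contrastive
    semantic enhancement' objective."""
    logits_a = model(src, tgt_in)  # (batch, tgt_len, vocab)
    logits_b = model(src, tgt_in)  # second pass, different dropout mask
    ce = 0.5 * (
        F.cross_entropy(logits_a.transpose(1, 2), tgt_out, ignore_index=pad_id)
        + F.cross_entropy(logits_b.transpose(1, 2), tgt_out, ignore_index=pad_id)
    )
    log_p = F.log_softmax(logits_a, dim=-1)
    log_q = F.log_softmax(logits_b, dim=-1)
    kl = 0.5 * (
        F.kl_div(log_p, log_q, log_target=True, reduction="batchmean")
        + F.kl_div(log_q, log_p, log_target=True, reduction="batchmean")
    )
    return ce + alpha * kl

def g2st_finetune(model, term_pairs, parallel_corpus, pad_id, lr=1e-5):
    """Step 1: adapt on bilingual e-commerce term pairs.
    Step 2: fine-tune on the annotated e-commerce parallel corpus."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()  # keep dropout active for the two stochastic passes
    for stage_data in (term_pairs, parallel_corpus):
        for src, tgt_in, tgt_out in stage_data:
            optimizer.zero_grad()
            loss = self_contrastive_loss(model, src, tgt_in, tgt_out, pad_id)
            loss.backward()
            optimizer.step()
```

Since the paradigm is stated to carry over to LLM-based NMT models, only the model interface and batching would need to change; the two-stage loop itself stays the same.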

References (19)
  1. Jinze Bai et al. 2023. Qwen Technical Report. arXiv preprint arXiv:2309.16609.
  2. Rachel Bawden and François Yvon. 2023. Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM. arXiv preprint arXiv:2303.01911.
  3. NLLB Team et al. 2022. No Language Left Behind: Scaling Human-Centered Machine Translation. arXiv preprint arXiv:2207.04672.
  4. Angela Fan et al. 2021. Beyond English-Centric Multilingual Machine Translation. Journal of Machine Learning Research, 4839–4886.
  5. Wenxiang Jiao et al. 2023. Is ChatGPT a Good Translator? Yes with GPT-4 as the Engine. arXiv preprint arXiv:2301.08745.
  6. Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980.
  7. Chin-Yew Lin. 2004. ROUGE: A Package for Automatic Evaluation of Summaries. In Text Summarization Branches Out, 74–81.
  8. Yinhan Liu et al. 2020. Multilingual Denoising Pre-training for Neural Machine Translation. Transactions of the Association for Computational Linguistics, 726–742.
  9. Niklas Muennighoff et al. 2023. Crosslingual Generalization through Multitask Finetuning. In Association for Computational Linguistics, 15991–16111.
  10. OpenAI. 2022. Introducing ChatGPT. OpenAI Blog.
  11. Matt Post. 2018. A Call for Clarity in Reporting BLEU Scores. In Proceedings of the Third Conference on Machine Translation: Research Papers, 186–191.
  12. Teven Le Scao et al. 2022. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. arXiv preprint arXiv:2211.05100.
  13. Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Neural Machine Translation of Rare Words with Subword Units. In Association for Computational Linguistics, 1715–1725.
  14. Hugo Touvron et al. 2023. LLaMA: Open and Efficient Foundation Language Models. arXiv preprint arXiv:2302.13971.
  15. Ashish Vaswani et al. 2017. Attention Is All You Need. In Advances in Neural Information Processing Systems.
  16. Xiaobo Liang et al. 2021. R-Drop: Regularized Dropout for Neural Networks. In Advances in Neural Information Processing Systems, 10890–10905.
  17. Linting Xue et al. 2021. mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer. In Conference of the North American Chapter of the Association for Computational Linguistics, 483–498.
  18. Susan Zhang et al. 2022. OPT: Open Pre-trained Transformer Language Models. arXiv preprint arXiv:2205.01068.
  19. Wenhao Zhu et al. 2023. Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis. arXiv preprint arXiv:2304.04675.
Authors (9)
  1. Kaidi Chen
  2. Ben Chen
  3. Dehong Gao
  4. Huangyu Dai
  5. Wen Jiang
  6. Wei Ning
  7. Shanqing Yu
  8. Libin Yang
  9. Xiaoyan Cai