General2Specialized LLMs Translation for E-commerce (2403.03689v2)
Abstract: Existing Neural Machine Translation (NMT) models mainly handle translation in the general domain and overlook domains with specialized writing conventions, such as e-commerce and legal documents. Taking e-commerce as an example, the texts usually contain large numbers of domain-specific terms and exhibit more grammatical irregularities, which leads to inferior performance from current NMT methods. To address these problems, we collect two domain-related resources: a set of term pairs (aligned Chinese-English bilingual terms) and a parallel corpus annotated for the e-commerce domain. Furthermore, we propose a two-step fine-tuning paradigm (named G2ST) with self-contrastive semantic enhancement to transfer a general NMT model to a specialized NMT model for e-commerce. The paradigm is applicable to NMT models built on LLMs. Extensive evaluations on real e-commerce titles demonstrate the superior translation quality and robustness of our G2ST approach compared with state-of-the-art models such as LLaMA, Qwen, GPT-3.5, and even GPT-4.
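The abstract does not spell out how the two-step fine-tuning or the "self-contrastive semantic enhancement" is implemented. As a reading aid, here is a minimal sketch of what one G2ST-style training step could look like, assuming a PyTorch/Hugging Face setup and an R-Drop-style symmetric KL penalty between two dropout passes (R-Drop appears in the reference list below, but the correspondence is our assumption). The model path, data loaders, and the loss weight `alpha` are illustrative placeholders, not the authors' released code.

```python
# Hedged sketch of a G2ST-style fine-tuning step; see assumptions flagged below.
# The self-contrastive term follows the R-Drop recipe cited in the references:
# two stochastic forward passes over the same batch, penalized by symmetric KL.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "path/to/general-nmt-llm"  # placeholder: any general-domain LLM translator

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.train()  # dropout must be active so the two passes differ
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def g2st_step(batch: dict, alpha: float = 1.0) -> float:
    """One fine-tuning step with an (assumed) R-Drop-style self-contrastive penalty.

    `batch` is assumed to hold `input_ids` and `attention_mask` tensors for a
    source-title / target-translation pair packed into one sequence.
    """
    # Two forward passes over identical tokens give two dropout "views".
    out1 = model(**batch, labels=batch["input_ids"])
    out2 = model(**batch, labels=batch["input_ids"])
    log_p = F.log_softmax(out1.logits, dim=-1)
    log_q = F.log_softmax(out2.logits, dim=-1)
    # Symmetric KL keeps the two views semantically consistent.
    kl = 0.5 * (
        F.kl_div(log_p, log_q, log_target=True, reduction="batchmean")
        + F.kl_div(log_q, log_p, log_target=True, reduction="batchmean")
    )
    loss = 0.5 * (out1.loss + out2.loss) + alpha * kl
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Assumed two-step schedule: adapt on the aligned Chinese-English term pairs
# first, then fine-tune on the annotated e-commerce parallel corpus.
# for batch in term_pair_loader:        # hypothetical DataLoader
#     g2st_step(batch)
# for batch in parallel_corpus_loader:  # hypothetical DataLoader
#     g2st_step(batch)
```

How strongly the KL term is weighted (`alpha`) and how the two stages are ordered are tuning choices the paper presumably makes; nothing in this sketch should be read as the authors' exact recipe.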
- Jinze Bai et al. 2023. Qwen Technical Report. arXiv preprint arXiv:2309.16609 (2023).
- Rachel Bawden and François Yvon. 2023. Investigating the Translation Performance of a Large Multilingual Language Model: The Case of BLOOM. arXiv preprint arXiv:2303.01911 (2023).
- NLLB Team et al. 2022. No Language Left Behind: Scaling Human-Centered Machine Translation. arXiv preprint arXiv:2207.04672 (2022).
- Angela Fan et al. 2021. Beyond English-Centric Multilingual Machine Translation. The Journal of Machine Learning Research (2021), 4839–4886.
- Wenxiang Jiao et al. 2023. Is ChatGPT a Good Translator? Yes with GPT-4 as the Engine. arXiv preprint arXiv:2301.08745 (2023).
- Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980 (2014).
- Chin-Yew Lin. 2004. ROUGE: A Package for Automatic Evaluation of Summaries. In Text Summarization Branches Out. 74–81.
- Yinhan Liu et al. 2020. Multilingual Denoising Pre-training for Neural Machine Translation. Transactions of the Association for Computational Linguistics (2020), 726–742.
- Niklas Muennighoff et al. 2023. Crosslingual Generalization through Multitask Finetuning. In Association for Computational Linguistics. 15991–16111.
- OpenAI. 2022. Introducing ChatGPT. OpenAI Blog (2022).
- Matt Post. 2018. A Call for Clarity in Reporting BLEU Scores. In Proceedings of the Third Conference on Machine Translation: Research Papers. 186–191.
- Teven Le Scao et al. 2022. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. arXiv preprint arXiv:2211.05100 (2022).
- Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Neural Machine Translation of Rare Words with Subword Units. In Association for Computational Linguistics. 1715–1725.
- Hugo Touvron et al. 2023. LLaMA: Open and Efficient Foundation Language Models. arXiv preprint arXiv:2302.13971 (2023).
- Ashish Vaswani et al. 2017. Attention Is All You Need. Advances in Neural Information Processing Systems (2017).
- Xiaobo Liang et al. 2021. R-Drop: Regularized Dropout for Neural Networks. Advances in Neural Information Processing Systems (2021), 10890–10905.
- Linting Xue et al. 2021. mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer. In Conference of the North American Chapter of the Association for Computational Linguistics. 483–498.
- Susan Zhang et al. 2022. OPT: Open Pre-trained Transformer Language Models. arXiv preprint arXiv:2205.01068 (2022).
- Wenhao Zhu et al. 2023. Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis. arXiv preprint arXiv:2304.04675 (2023).
- Kaidi Chen
- Ben Chen
- Dehong Gao
- Huangyu Dai
- Wen Jiang
- Wei Ning
- Shanqing Yu
- Libin Yang
- Xiaoyan Cai