Towards Reliable Neural Machine Translation with Consistency-Aware Meta-Learning (2303.10966v2)

Published 20 Mar 2023 in cs.CL

Abstract: Neural machine translation (NMT) has achieved remarkable success in producing high-quality translations. However, current NMT systems suffer from a lack of reliability: their outputs are often affected by lexical or syntactic changes in the input, resulting in large variations in quality. This limitation hinders the practicality and trustworthiness of NMT. A contributing factor to this problem is that NMT models trained with the one-to-one paradigm struggle to handle the source-diversity phenomenon, where inputs with the same meaning can be expressed differently. In this work, we treat this problem as a bilevel optimization problem and present a consistency-aware meta-learning (CAML) framework, derived from the model-agnostic meta-learning (MAML) algorithm, to address it. Specifically, the NMT model with CAML (named CoNMT) first learns a consistent meta representation of semantically equivalent sentences in the outer loop. Subsequently, a mapping from the meta representation to the output sentence is learned in the inner loop, allowing the NMT model to translate semantically equivalent sentences into the same target sentence. We conduct experiments on the NIST Chinese-to-English task, three WMT translation tasks, and the TED M2O task. The results demonstrate that CoNMT effectively improves overall translation quality and reliably handles diverse inputs.
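
To make the bilevel structure described in the abstract concrete, the sketch below shows one way a first-order, MAML-style inner/outer update over a set of paraphrases could look in PyTorch. It is a minimal illustration under stated assumptions, not the paper's implementation: the ToyNMT model, the caml_step function, the squared-error losses (a real system would use token-level cross-entropy over a Transformer), and the mean-representation consistency term are all hypothetical stand-ins.

```python
# A minimal, runnable sketch of a first-order, MAML-style bilevel update for
# consistency-aware training. Everything here is an illustrative assumption:
# the tiny linear ToyNMT stand-in, the squared-error losses, the
# mean-representation consistency term, and the single inner SGD step.
# It is not the paper's CoNMT code.
import copy
import torch
import torch.nn as nn

class ToyNMT(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.encoder = nn.Linear(dim, dim)  # stand-in for a Transformer encoder
        self.decoder = nn.Linear(dim, dim)  # stand-in for the decoder

    def encode(self, x):
        return self.encoder(x)

    def forward(self, x):
        return self.decoder(self.encode(x))

def caml_step(model, paraphrases, target, outer_opt, inner_lr=0.1):
    # Outer objective, part 1: pull the representations of semantically
    # equivalent inputs toward a shared "meta representation" (here, their mean).
    reps = [model.encode(x) for x in paraphrases]
    mean_rep = torch.stack(reps).mean(dim=0)
    consistency = sum(((r - mean_rep) ** 2).mean() for r in reps)

    # Inner loop: one SGD step on a copy of the model, fitting the mapping
    # from the representation to the shared target sentence.
    adapted = copy.deepcopy(model)
    inner_loss = sum(((adapted(x) - target) ** 2).mean() for x in paraphrases)
    grads = torch.autograd.grad(inner_loss, list(adapted.parameters()))
    with torch.no_grad():
        for p, g in zip(adapted.parameters(), grads):
            p -= inner_lr * g

    # Outer update, first-order MAML style: the gradient of the
    # post-adaptation loss w.r.t. the adapted parameters is applied directly
    # to the original parameters (no second-order derivatives).
    post = sum(((adapted(x) - target) ** 2).mean() for x in paraphrases)
    post_grads = torch.autograd.grad(post, list(adapted.parameters()))

    outer_opt.zero_grad()
    consistency.backward()
    with torch.no_grad():
        for p, g in zip(model.parameters(), post_grads):
            p.grad = p.grad + g if p.grad is not None else g.clone()
    outer_opt.step()
    return consistency.item() + post.item()

# Usage on random toy data: three "paraphrases" that share one target.
model = ToyNMT()
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
paraphrases = [torch.randn(4, 16) for _ in range(3)]
target = torch.randn(4, 16)
print(caml_step(model, paraphrases, target, opt))
```

The first-order approximation is a deliberate simplification: it avoids differentiating through the inner update, at the cost of ignoring second-order terms. The actual CAML objective and its optimization may differ.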

Authors (5)
  1. Rongxiang Weng
  2. Qiang Wang
  3. Wensen Cheng
  4. Changfeng Zhu
  5. Min Zhang