Domain Control for Neural Machine Translation (1612.06140v2)

Published 19 Dec 2016 in cs.CL

Abstract: Machine translation systems are very sensitive to the domains they were trained on. Several domain adaptation techniques have been deeply studied. We propose a new technique for neural machine translation (NMT) that we call domain control which is performed at runtime using a unique neural network covering multiple domains. The presented approach shows quality improvements when compared to dedicated domains translating on any of the covered domains and even on out-of-domain data. In addition, model parameters do not need to be re-estimated for each domain, making this effective to real use cases. Evaluation is carried out on English-to-French translation for two different testing scenarios. We first consider the case where an end-user performs translations on a known domain. Secondly, we consider the scenario where the domain is not known and predicted at the sentence level before translating. Results show consistent accuracy improvements for both conditions.


Domain Control for Neural Machine Translation

The paper "Domain Control for Neural Machine Translation" introduces an approach that builds domain control directly into a Neural Machine Translation (NMT) system. It addresses a well-known weakness: translation quality tends to drop when a system is applied to text from domains that differ from its training data. The work belongs to the line of research on domain adaptation, which aims to limit the quality loss caused by a mismatch between training and test domains.

Overview of the Proposed Method

The authors propose a domain control mechanism that is applied at runtime, using a single neural network that covers multiple domains. This contrasts with conventional methods that require re-training or re-estimating parameters for each domain, which makes the approach more practical to deploy. Two techniques are presented for integrating domain control into NMT: an additional token and a word feature. The additional token method appends a domain-specific token to each source sentence, while the word feature method extends the word embeddings with additional cells that encode domain information.
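The two tagging schemes can be illustrated with a minimal sketch. The token format `@MEDICAL@`, the function names, and the one-hot encoding of the domain cells are assumptions made for illustration only; the summary does not specify them, and in a trained system the extra cells would typically be learned rather than fixed.

```python
# Domains listed in the summary; the order is an arbitrary choice for this sketch.
DOMAINS = ["IT", "Literature", "Medical", "News", "Parliamentary", "Tourism"]


def check_domain(domain):
    """Reject labels outside the known domain set."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")


def add_domain_token(tokens, domain):
    """Additional-token variant: append one artificial token to the source sentence.

    The '@NAME@' spelling is a hypothetical convention, not taken from the paper.
    """
    check_domain(domain)
    return tokens + [f"@{domain.upper()}@"]


def add_domain_feature(tokens, domain):
    """Word-feature variant: attach the domain label to every source word."""
    check_domain(domain)
    return [(tok, domain) for tok in tokens]


def embed_with_feature(word_vector, domain):
    """Extend a word embedding with extra cells that encode the domain.

    One-hot cells are an assumption used here to keep the example dependency-free.
    """
    check_domain(domain)
    domain_cells = [1.0 if d == domain else 0.0 for d in DOMAINS]
    return list(word_vector) + domain_cells


if __name__ == "__main__":
    source = ["the", "patient", "was", "treated"]
    print(add_domain_token(source, "Medical"))
    print(add_domain_feature(source, "Medical"))
    print(embed_with_feature([0.1, 0.2], "Medical"))
```

In the token variant the domain is signalled once per sentence, whereas in the feature variant it is available at every source position, which is one plausible reason for the difference in behaviour between the two.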

Experimental Results

Experiments were conducted on an English-to-French translation task across six domains: IT, Literature, Medical, News, Parliamentary, and Tourism. The results, measured in BLEU, show consistent improvements in translation accuracy when domain information is incorporated, most notably with the word feature approach. The improvement holds both when the domain is known in advance and when it is predicted at the sentence level.

The headline result is an average improvement of 0.80 BLEU points for the Feature method over the Join model (a single network trained on the combined data of all domains, without domain information) across test sets. An automatic domain classification module extends the approach by predicting a domain tag for each sentence, so that improvements are also obtained when the domain is not known in advance.
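The unknown-domain scenario amounts to a two-step pipeline: predict a domain for the sentence, then tag the sentence before translation. The sketch below shows only that control flow. The keyword lists, the fallback domain, and the function names are hypothetical, and the keyword lookup is a stand-in for a trained classifier, not the classifier used by the authors.

```python
from collections import Counter

# Hypothetical keyword lists; a real system would use a trained sentence classifier.
KEYWORDS = {
    "IT": {"server", "install", "software", "click"},
    "Medical": {"patient", "dose", "treatment", "symptoms"},
    "Tourism": {"hotel", "beach", "booking", "museum"},
}


def predict_domain(tokens, default="News"):
    """Return the domain whose keywords match most often, or a fallback domain."""
    scores = Counter()
    for tok in tokens:
        for domain, words in KEYWORDS.items():
            if tok.lower() in words:
                scores[domain] += 1
    if not scores:
        return default  # assumption: fall back to a general-purpose domain
    return scores.most_common(1)[0][0]


def tag_for_translation(sentence):
    """Predict the domain at sentence level, then attach it as a word feature."""
    tokens = sentence.split()
    domain = predict_domain(tokens)
    return [(tok, domain) for tok in tokens]


if __name__ == "__main__":
    print(tag_for_translation("The patient received a second dose"))
```

Because the prediction is made per sentence, a misclassified sentence only affects its own translation, which is the setting the summary evaluates.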

Theoretical and Practical Implications

The introduction of domain control in NMT systems has both theoretical and practical implications. Theoretically, it shows that multi-domain adaptation is feasible within a single model, suggesting that a neural network can exploit domain-specific knowledge without compromising quality on generic data. Practically, it offers a promising way to deploy translation systems in environments where multi-domain translation requests are commonplace, such as global customer service platforms and international business communications.

Future Directions

The authors anticipate further work on the feature technique, adopting a more nuanced representation of the domain in the network: instead of a hard decision for a single domain, the input would carry a vector of proximity values to each domain. This extension could make better use of sentence-level classification and enable smoother transitions between domains.
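The difference between a hard domain decision and a proximity vector can be sketched as follows. The use of a softmax to turn classifier scores into proximities is an assumption chosen for illustration; the summary does not say how the proximity values would be computed, and the function names are hypothetical.

```python
import math


def hard_domain_vector(scores):
    """Hard decision: a one-hot vector selecting the highest-scoring domain."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(scores))]


def soft_domain_vector(scores):
    """Soft alternative: normalise scores into proximities that sum to 1.

    Softmax is an assumed choice here, not the authors' stated method.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    scores = [0.1, 2.0, 0.3]  # hypothetical per-domain classifier scores
    print(hard_domain_vector(scores))
    print(soft_domain_vector(scores))
```

With the soft vector, a sentence that sits between two domains contributes partial weight to both instead of being forced into one.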

Moreover, the work could be extended to document-level translation, applying domain adaptation coherently across an entire document rather than to isolated sentences. This would align more closely with how text is actually written and would give translations greater contextual relevance.

In summary, the research presents a significant step forward in domain adaptation for NMT, promising enhanced translation quality and adaptability in real-world applications. Future developments may focus on refining these capabilities and expanding the scope to larger and more varied datasets.


Authors (3)

Catherine Kobus (4 papers)
Josep Crego (15 papers)
Jean Senellart (17 papers)

Citations (182)

