
Utilizing Lexical Similarity between Related, Low-resource Languages for Pivot-based SMT (1702.07203v2)

Published 23 Feb 2017 in cs.CL

Abstract: We investigate pivot-based translation between related languages in a low-resource, phrase-based SMT setting. We show that a subword-level pivot-based SMT model using a related pivot language is substantially better than word- and morpheme-level pivot models. It is also highly competitive with the best direct translation model, which is encouraging as no direct source-target training corpus is used. We also show that combining multiple related language pivot models can rival a direct translation model. Thus, the use of subwords as translation units coupled with multiple related pivot languages can compensate for the lack of a direct parallel corpus.
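The core mechanism behind pivot-based SMT is phrase-table triangulation: a source-pivot table is composed with a pivot-target table so that source-target phrase pairs can be estimated without any direct parallel corpus. The following is a minimal toy sketch of that composition (not the paper's actual system); the phrase entries and probabilities are hypothetical, and the subword-style segmentation merely illustrates the idea of subword translation units.

```python
# Toy sketch of phrase-table triangulation, the core idea of pivot-based SMT.
# Tables map a source phrase to {target phrase: translation probability}.
# All entries below are hypothetical illustrations, not from the paper.

def triangulate(src_piv, piv_tgt):
    """Compose two phrase tables: p(t|s) ~= sum over pivots p of p(p|s) * p(t|p)."""
    src_tgt = {}
    for s, pivots in src_piv.items():
        for p, prob_sp in pivots.items():
            for t, prob_pt in piv_tgt.get(p, {}).items():
                src_tgt.setdefault(s, {})
                src_tgt[s][t] = src_tgt[s].get(t, 0.0) + prob_sp * prob_pt
    return src_tgt

# Hypothetical subword-level entries (space-separated subword units)
src_piv = {"pa ni": {"pa ni": 0.9, "ja l": 0.1}}   # source -> pivot
piv_tgt = {"pa ni": {"ne er": 0.8},                 # pivot -> target
           "ja l": {"ne er": 0.5}}

print(triangulate(src_piv, piv_tgt))
```

Because related languages share cognate subword sequences, many source subwords map to identical or near-identical pivot subwords, which is why subword-level triangulation loses less information than word-level triangulation.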

Authors (4)
  1. Anoop Kunchukuttan (45 papers)
  2. Maulik Shah (4 papers)
  3. Pradyot Prakash (4 papers)
  4. Pushpak Bhattacharyya (153 papers)
Citations (8)