A Rule-Based Approach For Aligning Japanese-Spanish Sentences From A Comparable Corpora (1211.4488v1)

Published 19 Nov 2012 in cs.CL and cs.AI

Abstract: The performance of a Statistical Machine Translation (SMT) system is directly proportional to the quality and size of the parallel corpus it uses. However, for some language pairs such corpora are severely lacking. Our long-term goal is to construct a Japanese-Spanish parallel corpus for use in SMT, since useful Japanese-Spanish parallel corpora are scarce. To address this problem, we propose a method for extracting Japanese-Spanish parallel sentences from Wikipedia using POS tagging and a rule-based approach. The approach focuses on the syntactic features of both languages. Human evaluation on a sample shows promising results compared with the baseline.

Authors (2)

Jessica C. Ramírez
Yuji Matsumoto


