A Rule-Based Approach For Aligning Japanese-Spanish Sentences From A Comparable Corpora (1211.4488v1)

Published 19 Nov 2012 in cs.CL and cs.AI

Abstract: The performance of a Statistical Machine Translation (SMT) system is directly proportional to the quality and size of the parallel corpus it uses. However, for some language pairs there is a considerable lack of such corpora. Our long-term goal is to construct a Japanese-Spanish parallel corpus to be used for SMT, but useful Japanese-Spanish parallel corpora are scarce. To address this problem, in this study we propose a method for extracting Japanese-Spanish parallel sentences from Wikipedia using POS tagging and a rule-based approach. The main focus of this approach is the syntactic features of both languages. Human evaluation was performed over a sample and shows promising results in comparison with the baseline.

Authors (2)
  1. Jessica C. Ramírez (1 paper)
  2. Yuji Matsumoto (52 papers)
Citations (4)
