Towards Unsupervised Grammatical Error Correction using Statistical Machine Translation with Synthetic Comparable Corpus (1907.09724v1)
Published 23 Jul 2019 in cs.CL
Abstract: We introduce unsupervised techniques based on phrase-based statistical machine translation for grammatical error correction (GEC), trained on a pseudo learner corpus created by Google Translation. We verified our GEC system through experiments on various GEC datasets, including the low resource track of the shared task at Building Educational Applications 2019 (BEA 2019). As a result, we achieved an F_0.5 score of 28.31 points on the test data of the low resource track.
- Satoru Katsumata (7 papers)
- Mamoru Komachi (40 papers)
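
The F_0.5 score reported in the abstract is the weighted harmonic mean of precision and recall with beta = 0.5, which weights precision more heavily than recall. The sketch below shows only this standard formula; the function name `f_beta` and the sample precision and recall values are illustrative assumptions and are not taken from the paper, whose official scores come from the shared task's own scorer.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Weighted harmonic mean of precision and recall.

    With beta < 1 the score leans toward precision; beta = 0.5 gives F_0.5.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        # No correct edits were proposed or expected; define the score as 0.
        return 0.0
    return (1 + b2) * precision * recall / denom


if __name__ == "__main__":
    # Hypothetical values chosen for illustration, not results from the paper.
    p, r = 0.5, 0.25
    print(f"F_0.5 = {f_beta(p, r):.4f}")
    print(f"F_1   = {f_beta(p, r, beta=1.0):.4f}")
```

Because beta is below 1, a system that proposes fewer but more accurate corrections scores higher under F_0.5 than under F_1 for the same precision and recall, which is the usual motivation for using it in GEC evaluation.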