
The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations (1702.03964v1)

Published 13 Feb 2017 in cs.CL

Abstract: The Parallel Meaning Bank is a corpus of translations annotated with shared, formal meaning representations comprising over 11 million words divided over four languages (English, German, Italian, and Dutch). Our approach is based on cross-lingual projection: automatically produced (and manually corrected) semantic annotations for English sentences are mapped onto their word-aligned translations, assuming that the translations are meaning-preserving. The semantic annotation consists of five main steps: (i) segmentation of the text in sentences and lexical items; (ii) syntactic parsing with Combinatory Categorial Grammar; (iii) universal semantic tagging; (iv) symbolization; and (v) compositional semantic analysis based on Discourse Representation Theory. These steps are performed using statistical models trained in a semi-supervised manner. The employed annotation models are all language-neutral. Our first results are promising.
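The cross-lingual projection idea in the abstract can be illustrated with a minimal sketch. It assumes word alignments are already available as (source index, target index) pairs and that an annotation is a single label per token. The function name `project_tags`, the placeholder labels, and the example sentences are hypothetical, chosen only for illustration. They are not the authors' implementation or tag set.

```python
from typing import List, Optional, Tuple


def project_tags(
    source_tags: List[str],
    alignment: List[Tuple[int, int]],
    target_len: int,
) -> List[Optional[str]]:
    """Copy each source token's tag onto the target tokens aligned to it.

    Hypothetical helper: target tokens without an alignment link stay None.
    """
    projected: List[Optional[str]] = [None] * target_len
    for src_idx, tgt_idx in alignment:
        # Keep the first tag seen for a target token; later links do not overwrite it.
        if projected[tgt_idx] is None:
            projected[tgt_idx] = source_tags[src_idx]
    return projected


# Illustrative data only (assumed, not taken from the corpus).
english = ["She", "sleeps"]
dutch = ["Zij", "slaapt"]
english_tags = ["PRO", "EVENT"]   # placeholder labels, not the real tag set
alignment = [(0, 0), (1, 1)]      # (English index, Dutch index)

dutch_tags = project_tags(english_tags, alignment, len(dutch))
```

Unaligned target tokens are left as `None` rather than guessed, which keeps the sketch consistent with the stated assumption that projection relies on meaning-preserving, word-aligned translations.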

Authors (8)
  1. Lasha Abzianidze (16 papers)
  2. Johannes Bjerva (52 papers)
  3. Kilian Evang (4 papers)
  4. Hessel Haagsma (5 papers)
  5. Rik van Noord (17 papers)
  6. Pierre Ludmann (1 paper)
  7. Duc-Duy Nguyen (1 paper)
  8. Johan Bos (27 papers)
Citations (155)
