
Multi-CrossRE: A Multi-Lingual Multi-Domain Dataset for Relation Extraction (2305.10985v1)

Published 18 May 2023 in cs.CL

Abstract: Most research in Relation Extraction (RE) involves the English language, mainly due to the lack of multi-lingual resources. We propose Multi-CrossRE, the broadest multi-lingual dataset for RE, including 26 languages in addition to English, and covering six text domains. Multi-CrossRE is a machine-translated version of CrossRE (Bassignana and Plank, 2022), with a sub-portion including more than 200 sentences in seven diverse languages checked by native speakers. We run a baseline model over the 26 new datasets and, as a sanity check, over the 26 back-translations to English. Results on the back-translated data are consistent with the ones on the original English CrossRE, indicating high quality of the translation and the resulting dataset.

Authors (5)
  1. Elisa Bassignana (14 papers)
  2. Filip Ginter (28 papers)
  3. Sampo Pyysalo (23 papers)
  4. Rob van der Goot (38 papers)
  5. Barbara Plank (130 papers)
Citations (3)
