MIA 2022 Shared Task: Evaluating Cross-lingual Open-Retrieval Question Answering for 16 Diverse Languages (2207.00758v1)

Published 2 Jul 2022 in cs.CL

Abstract: We present the results of the Workshop on Multilingual Information Access (MIA) 2022 Shared Task, evaluating cross-lingual open-retrieval question answering (QA) systems in 16 typologically diverse languages. In this task, we adapted two large-scale cross-lingual open-retrieval QA datasets in 14 typologically diverse languages, and newly annotated open-retrieval QA data in 2 underrepresented languages: Tagalog and Tamil. Four teams submitted their systems. The best system leveraging iteratively mined diverse negative examples and larger pretrained models achieves 32.2 F1, outperforming our baseline by 4.5 points. The second best system uses entity-aware contextualized representations for document retrieval, and achieves significant improvements in Tamil (20.8 F1), whereas most of the other systems yield nearly zero scores.
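The F1 scores quoted above are answer-level token-overlap F1, the metric commonly used for open-retrieval QA evaluation. As a rough illustration only, the sketch below computes SQuAD-style token F1 with plain whitespace tokenization; the actual shared-task evaluation applies language-specific normalization and tokenization (e.g. for languages without whitespace word boundaries), which is not shown here.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer string.

    Minimal sketch: lowercasing and whitespace tokenization only;
    real multilingual evaluation scripts add per-language normalization.
    """
    pred_tokens = prediction.lower().split()
    gold_tokens = reference.lower().split()
    if not pred_tokens or not gold_tokens:
        # If either side is empty, F1 is 1.0 only when both are empty.
        return float(pred_tokens == gold_tokens)
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# Example: partial overlap between prediction and gold answer
print(token_f1("barack obama", "barack hussein obama"))  # = 0.8
```

Scores such as the 32.2 F1 reported for the best system are averages of this per-example metric over the evaluation set (and across languages for the aggregate figure).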

Authors (9)
  1. Akari Asai (35 papers)
  2. Shayne Longpre (49 papers)
  3. Jungo Kasai (38 papers)
  4. Chia-Hsuan Lee (12 papers)
  5. Rui Zhang (1138 papers)
  6. Junjie Hu (111 papers)
  7. Ikuya Yamada (22 papers)
  8. Jonathan H. Clark (17 papers)
  9. Eunsol Choi (76 papers)
Citations (12)
