
KorQuAD1.0: Korean QA Dataset for Machine Reading Comprehension (1909.07005v2)

Published 16 Sep 2019 in cs.CL

Abstract: Machine Reading Comprehension (MRC) is a task that requires a machine to understand natural language and answer questions by reading a document. It is central to automatic response technologies such as chatbots and automated customer support systems. We present the Korean Question Answering Dataset (KorQuAD), a large-scale Korean dataset for the extractive machine reading comprehension task. It consists of 70,000+ human-generated question-answer pairs on Korean Wikipedia articles. We release KorQuAD1.0 and launch a challenge at https://KorQuAD.github.io to encourage the development of multilingual natural language processing research.

Authors (3)
  1. Seungyoung Lim (1 paper)
  2. Myungji Kim (1 paper)
  3. Jooyoul Lee (1 paper)
Citations (85)