A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC (1809.10735v2)

Published 27 Sep 2018 in cs.CL

Abstract: We compare three new datasets for question answering: SQuAD 2.0, QuAC, and CoQA, along several of their new features: (1) unanswerable questions, (2) multi-turn interactions, and (3) abstractive answers. We show that the datasets provide complementary coverage of the first two aspects, but weak coverage of the third. Because of the datasets' structural similarity, a single extractive model can be easily adapted to any of the datasets and we show improved baseline results on both SQuAD 2.0 and CoQA. Despite the similarity, models trained on one dataset are ineffective on another dataset, but we find moderate performance improvement through pretraining. To encourage cross-evaluation, we release code for conversion between datasets at https://github.com/my89/co-squac.
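
The conversion the abstract refers to is released in the co-squac repository; the following is only a minimal sketch of the idea, not the released code. It maps a simplified CoQA-style dialog onto a SQuAD 2.0-style record so a single-turn extractive model can consume it; the field names and the "prepend dialog history to the question" adaptation are assumptions for illustration and may differ from the paper's exact scheme.

import json
import uuid


def coqa_to_squad(coqa_record):
    """Map one CoQA-style dialog onto SQuAD 2.0-style question entries.

    Each turn's question is prepended with the earlier questions so a
    single-turn extractive model still sees the conversational context
    (a common adaptation; the paper's exact scheme may differ).
    """
    paragraph = {"context": coqa_record["story"], "qas": []}
    history = []
    for question, answer in zip(coqa_record["questions"], coqa_record["answers"]):
        history.append(question["input_text"])
        unanswerable = answer["span_text"] == "unknown"
        qa = {
            "id": str(uuid.uuid4()),
            "question": " ".join(history),  # question plus dialog history
            "is_impossible": unanswerable,  # SQuAD 2.0-style unanswerable flag
            "answers": []
            if unanswerable
            else [{"text": answer["span_text"], "answer_start": answer["span_start"]}],
        }
        paragraph["qas"].append(qa)
    return {"title": coqa_record.get("source", "coqa"), "paragraphs": [paragraph]}


if __name__ == "__main__":
    # Toy record with assumed field names, just to exercise the conversion.
    example = {
        "source": "wikipedia",
        "story": "The Vat is the library of the Holy See in Vatican City.",
        "questions": [{"input_text": "What is the Vat?"}],
        "answers": [{"span_text": "the library of the Holy See", "span_start": 11}],
    }
    print(json.dumps(coqa_to_squad(example), indent=2))

Converting in this direction keeps answers extractive (span text plus start offset), which is what lets the same baseline model train and evaluate across the three datasets despite their different intended uses.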

Authors (1)
  1. Mark Yatskar (38 papers)
Citations (94)