End-to-End QA on COVID-19: Domain Adaptation with Synthetic Training (2012.01414v1)

Published 2 Dec 2020 in cs.CL, cs.AI, and cs.IR

Abstract: End-to-end question answering (QA) requires both information retrieval (IR) over a large document collection and machine reading comprehension (MRC) on the retrieved passages. Recent work has successfully trained neural IR systems using only supervised QA examples from open-domain datasets. However, despite impressive performance on Wikipedia, neural IR lags behind traditional term-matching approaches such as BM25 in more specific and specialized target domains such as COVID-19. Furthermore, given little or no labeled data, effective adaptation of QA systems can also be challenging in such target domains. In this work, we explore the application of synthetically generated QA examples to improve performance on closed-domain retrieval and MRC. We combine our neural IR and MRC systems and show significant improvements in end-to-end QA on the CORD-19 collection over a state-of-the-art open-domain QA baseline.
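The abstract contrasts neural IR with the term-matching baseline BM25, which the paper reports as hard to beat on specialized collections like CORD-19. As a point of reference, here is a minimal, self-contained sketch of BM25 scoring over a toy tokenized corpus (the documents, query, and parameter values are illustrative assumptions, not from the paper):

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency of each query term across the collection
    df = {t: sum(1 for d in docs if t in d) for t in set(query_terms)}
    # Smoothed IDF (the common "+1 inside the log" variant, always non-negative)
    idf = {t: math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1) for t in df}
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            # Saturating term-frequency component with length normalization
            num = tf[t] * (k1 + 1)
            den = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf[t] * num / den
        scores.append(s)
    return scores

# Toy COVID-19-flavored corpus (hypothetical example documents)
docs = [
    "covid 19 transmission routes in hospitals".split(),
    "neural networks for image classification".split(),
    "covid 19 vaccine efficacy trial results".split(),
]
query = "covid 19 vaccine".split()
scores = bm25_scores(query, docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

Because BM25 rewards exact lexical overlap, the third document (matching all three query terms) ranks first here, which illustrates why term matching remains a strong baseline on domain-specific vocabulary that neural retrievers trained on open-domain data may not cover.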

Authors (8)
  1. Revanth Gangi Reddy (25 papers)
  2. Bhavani Iyer (6 papers)
  3. Md Arafat Sultan (25 papers)
  4. Rong Zhang (133 papers)
  5. Avi Sil (2 papers)
  6. Vittorio Castelli (24 papers)
  7. Radu Florian (54 papers)
  8. Salim Roukos (41 papers)
Citations (19)