
Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering (1607.06275v2)

Published 21 Jul 2016 in cs.CL, cs.AI, and cs.NE

Abstract: While question answering (QA) with neural networks, i.e. neural QA, has achieved promising results in recent years, the lack of a large-scale real-world QA dataset is still a challenge for developing and evaluating neural QA systems. To alleviate this problem, we propose a large-scale human-annotated real-world QA dataset, WebQA, with more than 42k questions and 556k evidences. As existing neural QA methods resolve QA either as a sequence generation problem or as a classification/ranking problem, they face the challenges of expensive softmax computation, handling of unseen answers, or a separate candidate answer generation component. In this work, we cast neural QA as a sequence labeling problem and propose an end-to-end sequence labeling model, which overcomes all the above challenges. Experimental results on WebQA show that our model outperforms the baselines significantly with an F1 score of 74.69% with word-based input, and the performance drops by only 3.72 F1 points with more challenging character-based input.
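
To make the sequence labeling framing concrete, the sketch below shows how answer spans could be read off per-token labels assigned to an evidence passage. It assumes a B/I/O tagging scheme (B = first answer token, I = answer continuation, O = outside the answer), which the abstract does not specify. The function name `extract_answers` and the example sentence are hypothetical, chosen only for illustration. This is not the authors' model, only the decoding step that a sequence labeling formulation implies.

```python
from typing import List


def extract_answers(tokens: List[str], labels: List[str]) -> List[str]:
    """Collect answer spans from per-token labels.

    Assumes a B/I/O scheme (an assumption, not stated in the abstract).
    An "I" label with no open span is treated as "O".
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    answers: List[str] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "B":
            # A new span starts; close any span that is still open.
            if current:
                answers.append(" ".join(current))
            current = [token]
        elif label == "I" and current:
            # Continue the open span.
            current.append(token)
        else:
            # "O" (or a stray "I") closes any open span.
            if current:
                answers.append(" ".join(current))
            current = []
    if current:
        answers.append(" ".join(current))
    return answers


# Invented example: the labels stand in for a model's predictions.
evidence = ["The", "capital", "of", "France", "is", "Paris", "."]
predicted = ["O", "O", "O", "O", "O", "B", "O"]
print(extract_answers(evidence, predicted))
```

Because the answer is copied out of the evidence text, this formulation needs neither a softmax over a large answer vocabulary nor a separate candidate generation step, and it can return answers never seen during training, which are the three challenges the abstract names.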

Authors (7)
  1. Peng Li (390 papers)
  2. Wei Li (1122 papers)
  3. Zhengyan He (1 paper)
  4. Xuguang Wang (6 papers)
  5. Ying Cao (30 papers)
  6. Jie Zhou (687 papers)
  7. Wei Xu (536 papers)
Citations (91)
