ANNA: Enhanced Language Representation for Question Answering (2203.14507v2)

Published 28 Mar 2022 in cs.CL

Abstract: Pre-trained language models have brought significant performance improvements across a variety of natural language processing tasks. Most existing models that achieve state-of-the-art results treat data processing, pre-training tasks, neural network modeling, and fine-tuning as separate concerns. In this paper, we demonstrate how each of these approaches affects performance individually, and show that a language model performs best on a specific question answering task when the approaches are considered jointly during pre-training. In particular, we propose an extended pre-training task and a new neighbor-aware mechanism that attends more to neighboring tokens to capture the richness of context for pre-training language modeling. Our best model achieves new state-of-the-art results of 95.7% F1 and 90.6% EM on SQuAD 1.1 and also outperforms existing pre-trained language models such as RoBERTa, ALBERT, ELECTRA, and XLNet on the SQuAD 2.0 benchmark.
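
The abstract mentions a "neighbor-aware mechanism" that attends more to neighboring tokens but gives no further detail here. Below is a minimal illustrative sketch, assuming the mechanism amounts to an additive locality bias on attention scores; the function name, window size, and bias value are hypothetical choices for illustration, not the paper's actual formulation.

```python
import numpy as np

def neighbor_aware_attention(Q, K, V, window=2, neighbor_bias=1.0):
    """Hypothetical sketch: scaled dot-product attention with an additive
    bias that boosts scores for keys within `window` positions of the query,
    concentrating attention mass on neighboring tokens.
    Q, K, V: (seq_len, d) arrays; returns (seq_len, d) context vectors."""
    seq_len, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)               # (seq_len, seq_len) raw scores
    idx = np.arange(seq_len)
    dist = np.abs(idx[:, None] - idx[None, :])  # |query pos - key pos|
    scores = scores + np.where(dist <= window, neighbor_bias, 0.0)
    scores -= scores.max(axis=-1, keepdims=True)  # stabilized row-wise softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Tiny usage example with random embeddings.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(6, 8))
print(neighbor_aware_attention(Q, K, V).shape)  # (6, 8)
```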

Authors (7)
  1. Changwook Jun
  2. Hansol Jang
  3. Myoseop Sim
  4. Hyun Kim
  5. Jooyoung Choi
  6. Kyungkoo Min
  7. Kyunghoon Bae