ANNA: Enhanced Language Representation for Question Answering (2203.14507v2)
Abstract: Pre-trained language models have brought significant improvements in performance across a variety of natural language processing tasks. Most existing models that achieve state-of-the-art results have presented their approaches from the separate perspectives of data processing, pre-training tasks, neural network modeling, or fine-tuning. In this paper, we demonstrate how these approaches affect performance individually, and show that the language model performs best on a specific question answering task when they are jointly considered during pre-training. In particular, we propose an extended pre-training task and a new neighbor-aware mechanism that attends more to neighboring tokens to capture the richness of context for pre-training language modeling. Our best model achieves new state-of-the-art results of 95.7% F1 and 90.6% EM on SQuAD 1.1 and also outperforms existing pre-trained language models such as RoBERTa, ALBERT, ELECTRA, and XLNet on the SQuAD 2.0 benchmark.
- Changwook Jun
- Hansol Jang
- Myoseop Sim
- Hyun Kim
- Jooyoung Choi
- Kyungkoo Min
- Kyunghoon Bae
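
The abstract does not spell out how the neighbor-aware mechanism is defined. As a rough illustration only, the sketch below shows one common way to bias attention toward nearby tokens: an additive score bonus for positions within a fixed window, applied before the softmax. The function name, window size, and bias value here are hypothetical and are not taken from the paper.

```python
# Minimal sketch (not the paper's exact formulation): scaled dot-product
# attention with an additive bias that favors neighboring tokens.
import torch
import torch.nn.functional as F


def neighbor_aware_attention(q, k, v, window=2, neighbor_bias=1.0):
    """q, k, v: (batch, seq_len, dim). Tokens within `window` positions
    of each query receive an additive score bonus before the softmax."""
    d = q.size(-1)
    scores = torch.matmul(q, k.transpose(-2, -1)) / d ** 0.5  # (B, L, L)

    # Build a |i - j| <= window neighborhood mask and add a fixed bias there.
    L = q.size(1)
    idx = torch.arange(L, device=q.device)
    dist = (idx[None, :] - idx[:, None]).abs()            # (L, L) token distances
    bias = (dist <= window).float() * neighbor_bias       # bonus for neighbors
    scores = scores + bias                                 # broadcast over batch

    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, v)


# Usage: attend over a toy batch of 8-token sequences.
q = k = v = torch.randn(2, 8, 64)
out = neighbor_aware_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 64])
```

The additive-bias form is only one way to realize neighbor awareness; a windowed mask or learned relative-position weights would serve the same purpose of emphasizing local context.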