
When to Fold'em: How to answer Unanswerable questions (2105.00328v1)

Published 1 May 2021 in cs.CL

Abstract: We present three question-answering models trained on the SQuAD2.0 dataset (BIDAF, DocumentQA and ALBERT Retro-Reader), demonstrating the improvement of language models over the past three years. Through our research in fine-tuning pre-trained models for question-answering, we developed a novel approach capable of achieving a 2-percentage-point improvement in SQuAD2.0 F1 with reduced training time. Our method of re-initializing select layers of a parameter-shared language model is simple yet empirically powerful.
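
Since the abstract reports its gain in SQuAD2.0 F1, it may help to recall how that metric handles unanswerable questions. The following is a minimal sketch, assuming the token-overlap convention of the public SQuAD2.0 evaluation script; the function names `normalize` and `f1_score` are illustrative and are not taken from the paper. An unanswerable question is represented by an empty gold answer, and a prediction earns credit on it only by abstaining (predicting the empty string).

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def f1_score(prediction, gold):
    """Token-level F1 between a predicted and a gold answer string.

    Assumption: an empty gold string marks an unanswerable question.
    """
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()

    # If either side is empty, the score is 1 only when both are empty,
    # i.e. the model correctly abstained on an unanswerable question.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this convention, answering an unanswerable question scores zero no matter what text is produced, which is why deciding when to abstain matters as much as extracting the right span.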

Authors (3)
  1. Marshall Ho (2 papers)
  2. Zhipeng Zhou (32 papers)
  3. Judith He (1 paper)
Citations (2)