FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation (2012.15482v1)

Published 31 Dec 2020 in cs.CL

Abstract: Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for NLP tasks such as Question Answering (QA) and Fact Verification. Recently, pre-trained sequence-to-sequence (seq2seq) models have proven to be very effective at jointly making predictions and generating NL explanations. However, these models have many shortcomings: they can fabricate explanations even for incorrect predictions, they are difficult to adapt to long input documents, and their training requires a large amount of labeled data. In this paper, we develop FiD-Ex, which addresses these shortcomings for seq2seq models by: 1) introducing sentence markers to eliminate explanation fabrication by encouraging extractive generation, 2) using the fusion-in-decoder architecture to handle long input contexts, and 3) intermediate fine-tuning on re-structured open-domain QA datasets to improve few-shot performance. FiD-Ex significantly improves over prior work in terms of both explanation metrics and task accuracy on multiple tasks from the ERASER explainability benchmark, in both the fully supervised and few-shot settings.
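
The sentence-marker mechanism lends itself to a brief illustration. The sketch below is a hypothetical Python rendering of the idea, not the authors' released code: each input sentence is prefixed with an index marker, so a seq2seq model trained on such inputs can emit marker indices instead of free-form text, making the generated rationale extractive by construction. The function names and the example output are assumptions for illustration only.

```python
# A minimal sketch (assumed, not the authors' implementation) of the
# sentence-marker idea from FiD-Ex: prefix each passage sentence with an
# index token, and have the model generate rationale sentence indices
# rather than free text, so the rationale is extractive by construction.

def add_sentence_markers(sentences):
    """Prefix each sentence with a marker token, e.g. 'sentence 3:'."""
    return " ".join(f"sentence {i}: {s}" for i, s in enumerate(sentences, start=1))

def decode_rationale(generated_markers, sentences):
    """Map generated marker indices back to the original sentences."""
    ids = [int(tok) for tok in generated_markers.split() if tok.isdigit()]
    return [sentences[i - 1] for i in ids if 1 <= i <= len(sentences)]

sentences = [
    "The Eiffel Tower is in Paris.",
    "It was completed in 1889.",
    "Paris is the capital of France.",
]
marked_input = add_sentence_markers(sentences)
# A trained seq2seq model would generate something like "1 3" as the
# rationale; we hard-code that output here for illustration.
print(decode_rationale("1 3", sentences))
```

Because the decoder output is mapped back to marker indices, the recovered rationale is always a subset of the input sentences, which is what rules out fabricated explanation text.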

Authors (6)
  1. Kushal Lakhotia
  2. Bhargavi Paranjape
  3. Asish Ghoshal
  4. Wen-tau Yih
  5. Yashar Mehdad
  6. Srinivasan Iyer
Citations (25)
