Generating Self-Contained and Summary-Centric Question Answer Pairs via Differentiable Reward Imitation Learning (2109.04689v1)
Abstract: Motivated by suggested question generation in conversational news recommendation systems, we propose a model for generating question-answer pairs (QA pairs) with self-contained, summary-centric questions and length-constrained, article-summarizing answers. We begin by collecting a new dataset of news articles with questions as titles and pairing them with summaries of varying length. This dataset is used to learn a QA pair generation model producing summaries as answers that balance brevity with sufficiency jointly with their corresponding questions. We then reinforce the QA pair generation process with a differentiable reward function to mitigate exposure bias, a common problem in natural language generation. Both automatic metrics and human evaluation demonstrate these QA pairs successfully capture the central gists of the articles and achieve high answer accuracy.
- Li Zhou
- Kevin Small
- Yong Zhang
- Sandeep Atluri