Multi-Reward Reinforced Summarization with Saliency and Entailment
The paper "Multi-Reward Reinforced Summarization with Saliency and Entailment" by Ramakanth Pasunuru and Mohit Bansal presents a reinforcement learning (RL) framework designed to address key challenges in abstractive text summarization, namely, saliency, entailment, and non-redundancy. The authors propose two novel reward functions, ROUGESal and Entail, integrated with a pre-existing coverage-based model, achieving state-of-the-art results on benchmark datasets like CNN/Daily Mail and demonstrating strong transferability in a test-only setup on DUC-2002.
Methodology and Novel Contributions
The authors explore the abstractive summarization task, emphasizing the need for summaries that not only reduce content length but also highlight salient information, ensure logical consistency with the source text, and avoid redundancy. While coverage-based models have addressed redundancy, the authors argue that saliency and logical entailment remain inadequately tackled.
- ROUGESal Reward Function: This reward modifies the standard ROUGE metric so that salient words and phrases count more than ordinary ones. Saliency weights come from a saliency predictor trained on the SQuAD dataset, which uses human-annotated answer spans as a proxy for the important content of a document. Each token's contribution to the ROUGE computation is then weighted by its predicted saliency probability, steering the generator toward summaries that cover the essential information (a minimal sketch of this weighting appears after this list).
- Entail Reward Function: For logical consistency, the authors employ an entailment classifier trained on the SNLI and Multi-NLI datasets. The classifier scores whether the generated summary can be logically inferred from the key content of the input; because a very short summary can be trivially entailed, the reward is length-normalized to penalize summaries that are correct but uninformative (also sketched below).
- Multi-Reward Optimization: Rather than collapsing the rewards into a single weighted sum, which would require careful reward scaling and tuning, the authors alternate the reward being optimized across mini-batches. This treats each reward as a separate objective in the spirit of multi-task learning, but within a single summarization task (see the training-loop sketch after this list).
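A minimal sketch of the two reward shapes follows, assuming hypothetical helpers `saliency_prob(token)` (a stand-in for the SQuAD-trained saliency predictor) and `entail_prob(premise, hypothesis)` (a stand-in for the SNLI/Multi-NLI entailment classifier). It illustrates saliency-weighted unigram overlap and one simple form of length-normalized entailment scoring, not the authors' exact implementation.

```python
from collections import Counter

def rougesal_f1(candidate, reference, saliency_prob, base_weight=1.0):
    """Saliency-weighted unigram-overlap F1 (ROUGE-1-style sketch).

    Every token contributes base_weight plus its predicted saliency
    probability, so matches on salient tokens count more than matches
    on ordinary tokens.
    """
    weight = lambda tok: base_weight + saliency_prob(tok)
    cand_counts, ref_counts = Counter(candidate), Counter(reference)
    # Weighted mass of overlapping tokens (counts clipped, as in ROUGE).
    overlap = sum(min(cand_counts[t], ref_counts[t]) * weight(t) for t in cand_counts)
    cand_mass = sum(c * weight(t) for t, c in cand_counts.items())
    ref_mass = sum(c * weight(t) for t, c in ref_counts.items())
    precision = overlap / cand_mass if cand_mass else 0.0
    recall = overlap / ref_mass if ref_mass else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def entail_reward(candidate, premise, entail_prob):
    """Length-normalized entailment reward sketch.

    Scaling the classifier score by the candidate/premise length ratio
    (capped at 1 here, as an illustrative choice) keeps short but
    trivially entailed summaries from scoring highly.
    """
    score = entail_prob(" ".join(premise), " ".join(candidate))
    return score * min(1.0, len(candidate) / max(1, len(premise)))
```

Both functions take tokenized text; the exact weighting and normalization details in the paper may differ from this sketch.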
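The alternating optimization itself is compact to express. The sketch below shows only the control flow of switching rewards per mini-batch; `sample_summary`, `greedy_summary`, and `update_policy` are hypothetical stand-ins for the sequence model's sampling, greedy decoding, and self-critical policy-gradient update (which, in the paper, is mixed with a cross-entropy loss).

```python
from itertools import cycle

def train_multi_reward(batches, rewards, sample_summary, greedy_summary, update_policy):
    """Alternate reward functions across mini-batches (multi-task style).

    `rewards` is a list of callables reward(candidate, example); each
    mini-batch is optimized with exactly one of them, so no reward-scaling
    factors need to be tuned.
    """
    reward_schedule = cycle(rewards)          # e.g. [rougesal, entail], alternated
    for batch in batches:
        reward_fn = next(reward_schedule)     # pick this mini-batch's reward
        for example in batch:
            sampled = sample_summary(example)    # summary sampled from the policy
            baseline = greedy_summary(example)   # greedy decode as self-critical baseline
            # Advantage: sampled reward minus greedy-baseline reward.
            advantage = reward_fn(sampled, example) - reward_fn(baseline, example)
            update_policy(example, sampled, advantage)
```

Calling `train_multi_reward(batches, [rougesal_reward, entail_reward], ...)` alternates the two rewards batch by batch, which is what lets the approach sidestep reward rescaling entirely.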
Results and Implications
The proposed model, particularly when trained with both the ROUGESal and Entail rewards, outperforms the baseline and previous approaches on automatic metrics (e.g., ROUGE, METEOR) as well as in human evaluations of relevance and readability. It captures salient information and maintains logical consistency more reliably, and its substantial test-only improvements on DUC-2002 indicate stronger cross-domain transferability and a broader range of potential applications.
Theoretical Contributions and Future Directions
The work advances the reinforcement learning paradigm for text summarization by showing that multiple rewards, each encouraging a different aspect of summary quality, can be integrated effectively. Beyond improving state-of-the-art performance, this opens the way for more nuanced approaches that use other linguistic properties, such as sentiment or factual accuracy, as reward signals.
Future work may explore the integration of semantic information for even richer summarization, extending the entailment logic to encompass pragmatic or contextual aspects of language. Additionally, there are promising avenues in fine-tuning the reward mechanisms and examining other policy-based RL approaches to further optimize learning and model performance.
Overall, Pasunuru and Bansal’s work is a significant demonstration of reinforcement learning’s applicability to comprehensive, high-quality text summarization, setting a precedent for future research on integrating linguistic features into summarization systems.