Ranking Sentences for Extractive Summarization with Reinforcement Learning (1802.08636v2)

Published 23 Feb 2018 in cs.CL

Abstract: Single document summarization is the task of producing a shorter version of a document while preserving its principal information content. In this paper we conceptualize extractive summarization as a sentence ranking task and propose a novel training algorithm which globally optimizes the ROUGE evaluation metric through a reinforcement learning objective. We use our algorithm to train a neural summarization model on the CNN and DailyMail datasets and demonstrate experimentally that it outperforms state-of-the-art extractive and abstractive systems when evaluated automatically and by humans.

Authors (3)
  1. Shashi Narayan (35 papers)
  2. Shay B. Cohen (78 papers)
  3. Mirella Lapata (135 papers)
Citations (534)

Summary

Reinforcement Learning for Extractive Summarization: A Sentence Ranking Approach

The paper, "Ranking Sentences for Extractive Summarization with Reinforcement Learning," makes a significant contribution to automated text summarization by framing extractive summarization as a sentence ranking task. The research advances beyond traditional cross-entropy training by directly optimizing the ROUGE evaluation metric with reinforcement learning (RL), specifically a policy gradient approach. This methodological shift aligns the training objective with the criteria by which summaries are actually evaluated, yielding stronger summarization performance.

Methodology

The proposed model is a hierarchical architecture with a sentence encoder, a document encoder, and a sentence extractor. The sentence encoder uses convolutional neural networks (CNNs) to build sentence representations that capture salient information patterns. A recurrent neural network with LSTM cells then composes these sentence representations into a document representation, modeling both the local and global importance of each sentence. A sketch of this hierarchy appears below.
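To make the hierarchy concrete, the following is a minimal PyTorch sketch of the encoder and extractor; the class names, dimensions, and hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SentenceEncoder(nn.Module):
    """CNNs over word embeddings; max-pooling yields a fixed-size sentence vector."""
    def __init__(self, vocab_size, emb_dim=128, num_filters=100, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, num_filters, k, padding=k // 2) for k in kernel_sizes
        )

    def forward(self, words):                  # words: (num_sents, max_words)
        x = self.embed(words).transpose(1, 2)  # (num_sents, emb_dim, max_words)
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return torch.cat(pooled, dim=1)        # one fixed-size vector per sentence

class SentenceExtractor(nn.Module):
    """LSTM document encoder followed by a per-sentence relevance score."""
    def __init__(self, sent_dim, hidden_dim=256):
        super().__init__()
        self.doc_lstm = nn.LSTM(sent_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, sent_vecs):              # sent_vecs: (num_sents, sent_dim)
        states, _ = self.doc_lstm(sent_vecs.unsqueeze(0))
        return self.score(states.squeeze(0)).squeeze(-1)  # one logit per sentence
```

Under this sketch, ranking amounts to sorting sentences by their logits and extracting the top few as the summary.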

Central to the paper is the claim that standard cross-entropy loss produces verbose, less informative summaries because it is misaligned with evaluation metrics like ROUGE. The paper addresses this disconnect by using RL to globally optimize sentence selection, improving the relevance of the extracted sentences. The reinforcement learning framework employed is the REINFORCE algorithm, which optimizes a reward function based on ROUGE. This lets the model discriminate more sharply among sentences, assigning higher ranks to those that appear frequently in high-scoring summaries. A simplified sketch of the update follows.
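As a rough illustration of the REINFORCE objective, the sketch below treats each sentence as an independent Bernoulli inclusion decision and scales the log-likelihood of the sampled extract by its ROUGE reward. This is a deliberate simplification: the paper samples candidate extracts from the ranked sentence distribution, and `rouge_fn` here is a stand-in for an external ROUGE scorer. Both are assumptions for illustration, not the authors' implementation.

```python
import torch

def reinforce_step(logits, sentences, reference, rouge_fn, budget=3):
    """One REINFORCE update: sample an extract, score it with ROUGE, and
    weight the log-likelihood of the sampled decisions by that reward."""
    probs = torch.sigmoid(logits)                   # per-sentence inclusion probability
    labels = torch.bernoulli(probs)                 # sampled 0/1 extraction decisions
    picked = [s for s, keep in zip(sentences, labels) if keep > 0][:budget]
    reward = rouge_fn(" ".join(picked), reference)  # scalar, e.g. mean ROUGE F-score
    log_prob = torch.distributions.Bernoulli(probs).log_prob(labels).sum()
    return -reward * log_prob  # minimizing this loss maximizes expected ROUGE
```

In practice, policy-gradient training typically subtracts a baseline from the reward to reduce gradient variance; that refinement is omitted here for brevity.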

Experimental Results

The research demonstrates substantial improvements over competing models in both automatic and human evaluations. On the widely used CNN and DailyMail datasets, the proposed method, named Refresh, consistently outperforms state-of-the-art extractive and abstractive systems. Human evaluations corroborate these findings, underscoring Refresh's efficacy in producing summaries that are more informative and coherent than those generated by leading abstractive models.

Implications and Future Directions

This paper's approach and results underscore a shift in how automatic summarization can be improved through direct metric optimization. By using RL to align learning with evaluation criteria, it addresses a key mismatch in how such models are conventionally trained. The findings imply practical gains for tasks that require concise, relevant document representations, such as news aggregation and content curation.

Future research could explore further refinements in sentence ranking, perhaps operating on finer-grained discourse units than sentences. Extensions might integrate sentence compression within the RL framework or adapt the approach to multi-document summarization, given its promising single-document results.

Overall, this paper provides a robust framework for employing reinforcement learning in extractive summarization, demonstrating clear improvements over conventional methods by harmonizing model training with the desired evaluative outcomes.