Understanding the Extraction of Coherent Summaries via Deep Reinforcement Learning
The paper "Learning to Extract Coherent Summary via Deep Reinforcement Learning" by Yuxiang Wu and Baotian Hu presents a novel approach to extractive summarization that integrates coherence into the training objective through reinforcement learning (RL). The authors target the perennial challenge of producing coherent and informative summaries from long documents—a task that remains unresolved despite significant advances in deep neural networks (DNNs) for NLP.
Summary of Contributions
This work presents two main contributions to the field:
- Neural Coherence Model: By leveraging distributed sentence representations in an end-to-end manner, the neural coherence model bypasses traditional feature engineering, notably eschewing the need for entity recognition systems prevalent in entity grid models. This model applies convolutional and max-pooling layers to capture discourse relations and local entity transitions across sentences.
- Reinforced Neural Extractive Summarization (RNES) Model: The paper introduces RNES, a framework that uses RL to jointly optimize coherence and informativeness. Coherence scores produced by the neural coherence model are combined with ROUGE metrics to form the reward. RNES outperforms several strong baselines and sets state-of-the-art performance on the CNN/Daily Mail dataset as measured by ROUGE.
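The coherence scoring described above—convolution over distributed sentence representations followed by max-pooling—can be caricatured in a few lines. The following is a minimal NumPy sketch, not the paper's implementation: the embedding dimension, window size, filter count, and the single-step "convolution" over a concatenated window are all simplifying assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def window_coherence(sent_vecs, filters, bias):
    """Score local coherence of a window of sentence embeddings.

    sent_vecs: (window, d) distributed sentence representations
    filters:   (n_filters, window * d) filters applied to the concatenated
               window (a one-step stand-in for a convolutional layer)
    """
    x = sent_vecs.reshape(-1)            # concatenate the window
    h = np.tanh(filters @ x + bias)      # filter responses + nonlinearity
    return h.max()                       # max-pooling -> scalar local score

def document_coherence(doc_vecs, filters, bias, window=3):
    """Average local scores over all sliding windows of adjacent sentences."""
    scores = [
        window_coherence(doc_vecs[i:i + window], filters, bias)
        for i in range(len(doc_vecs) - window + 1)
    ]
    return float(np.mean(scores))

# Toy usage: 6 sentences with 8-dimensional embeddings, 4 random filters.
d, window, n_filters = 8, 3, 4
filters = rng.normal(size=(n_filters, window * d))
bias = rng.normal(size=n_filters)
doc = rng.normal(size=(6, d))
score = document_coherence(doc, filters, bias)
```

Because sentence representations are learned end-to-end, a model of this shape needs no entity recognizer or hand-built entity grid; the filters are free to pick up whatever cross-sentence regularities predict coherent orderings.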
Technical Insight
The proposed approach hinges on two interlinked models: the neural coherence model and RNES. RNES has a hierarchical architecture: convolutional neural networks encode sentences at the word level, a Bi-GRU models sentence-level context, and an MLP decides whether to extract each sentence given that context. During RL training, RNES combines the coherence score and ROUGE into its reward, balancing coherence against informativeness and addressing a common failure of traditional extractive methods: adjacent extracted sentences that do not read coherently together.
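The RL training step can be sketched as a REINFORCE update for a Bernoulli extract-or-not policy whose reward blends ROUGE with the coherence score. Everything in this NumPy sketch is an illustrative assumption: the logistic policy head stands in for the paper's Bi-GRU + MLP extractor, the trade-off weight `lam` is arbitrary, and the toy reward lambdas replace real ROUGE and coherence scorers.

```python
import numpy as np

rng = np.random.default_rng(1)

def combined_reward(rouge, coherence, lam=0.5):
    """Blend informativeness (ROUGE) with coherence; lam is illustrative."""
    return rouge + lam * coherence

def reinforce_step(theta, contexts, rouge_fn, coherence_fn, lr=0.1):
    """One REINFORCE update for a Bernoulli extraction policy.

    contexts: (n_sents, d) sentence context vectors (stand-ins for the
              Bi-GRU outputs in the hierarchical encoder)
    theta:    (d,) weights of a logistic 'extract or not' policy head
    """
    logits = contexts @ theta
    probs = 1.0 / (1.0 + np.exp(-logits))
    actions = (rng.random(len(probs)) < probs).astype(float)  # sample labels
    reward = combined_reward(rouge_fn(actions), coherence_fn(actions))
    # REINFORCE: grad log pi for a Bernoulli policy is (a - p) * x,
    # summed over sentences and scaled by the episode reward.
    grad = ((actions - probs)[:, None] * contexts).sum(axis=0)
    return theta + lr * reward * grad, reward

# Toy usage: 5 sentences, 4-dim contexts, rewards favoring ~50% extraction.
d, n = 4, 5
theta = np.zeros(d)
contexts = rng.normal(size=(n, d))
rouge_fn = lambda a: 1.0 - abs(a.mean() - 0.5)   # toy informativeness
coherence_fn = lambda a: 0.0                      # toy coherence
theta, r = reinforce_step(theta, contexts, rouge_fn, coherence_fn)
```

The key design point survives the simplification: because the reward is computed on the whole sampled summary, the policy receives a learning signal for cross-sentence properties like coherence that no per-sentence supervised loss can express.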
Empirical Evaluation
The experiments demonstrate the efficacy of the RNES model. Notably, incorporating the coherence reward into the RL framework yields summaries that, according to qualitative evaluation, are not only more coherent but also more informative than those generated without it. The RNES design ensures that coherence does not come at the cost of informativeness—a crucial consideration, since ROUGE-based evaluation admittedly does not assess coherence between sentences.
Implications and Future Work
This research holds significant implications for both practical applications and theoretical exploration. On a practical front, the capacity to consistently generate coherent and informative summaries could transform content curation and information retrieval systems. Theoretically, it opens avenues for further refinements in coherence modeling—a necessity for effective summarization—without reliance on traditional models that demand heavy feature engineering.
The paper suggests that improving the neural coherence model itself could further enhance summarization performance. The authors also hint at the benefit of integrating human knowledge into RNES, perhaps through hybrid systems that combine the predictive power of learned models with human insight or context awareness.
In conclusion, the integration of coherence modeling into reinforcement learning for extractive summarization introduces a robust framework capable of generating high-quality summaries. While the paper sets a new benchmark in contextually-aware sentence extraction, it also lays the groundwork for future exploration into hybrid systems and the seamless incorporation of human expertise into automated summarization pathways.