Neural Document Summarization by Jointly Learning to Score and Select Sentences: A Review
The paper "Neural Document Summarization by Jointly Learning to Score and Select Sentences" introduces the Neural Extractive Summarization (NeuSum) framework that integrates sentence scoring and selection into a unified process. This methodology moves away from the traditional bifurcation of scoring and selection tasks, fostering a more cohesive and context-aware extraction of informative content from documents.
Framework Overview
The NeuSum model is built as an end-to-end neural architecture that jointly optimizes sentence scoring and sentence selection. A hierarchical encoder produces sentence representations, and a GRU-based extractor then selects sentences one at a time. Key to the design is that the selection history is fed back into the scorer, so the model estimates each sentence's importance relative to the partially built summary rather than in isolation. A minimal sketch of this score-and-select loop follows.
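The PyTorch sketch below illustrates the idea under stated assumptions: the hidden sizes, the mean-pooled initialization of the selector state, the MLP scorer, and the greedy argmax decoding are illustrative choices of mine, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class NeuSumSketch(nn.Module):
    """Illustrative score-and-select extractor in the spirit of NeuSum."""

    def __init__(self, vocab_size, emb_dim=50, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Sentence-level BiGRU: builds a vector for each sentence from its words.
        self.sent_enc = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)
        # Document-level BiGRU: contextualizes sentence vectors across the document.
        self.doc_enc = nn.GRU(2 * hid_dim, hid_dim, bidirectional=True, batch_first=True)
        # Selection GRU cell: carries the history of already-selected sentences.
        self.sel_gru = nn.GRUCell(2 * hid_dim, 2 * hid_dim)
        # MLP scorer over (selector state, candidate sentence vector).
        self.score = nn.Sequential(
            nn.Linear(4 * hid_dim, hid_dim), nn.Tanh(), nn.Linear(hid_dim, 1)
        )

    def encode(self, docs):
        # docs: (batch, n_sents, n_words) word ids.
        b, n, w = docs.shape
        words = self.embed(docs.view(b * n, w))            # (b*n, w, emb)
        _, h = self.sent_enc(words)                        # h: (2, b*n, hid)
        sents = h.transpose(0, 1).reshape(b, n, -1)        # (b, n, 2*hid)
        sents, _ = self.doc_enc(sents)                     # context-aware vectors
        return sents

    def forward(self, docs, n_select=3):
        sents = self.encode(docs)
        b, n, _ = sents.shape
        state = sents.mean(dim=1)                          # init selector state
        picked = []
        mask = torch.zeros(b, n, dtype=torch.bool, device=sents.device)
        for _ in range(n_select):
            pair = torch.cat([state.unsqueeze(1).expand(-1, n, -1), sents], dim=-1)
            scores = self.score(pair).squeeze(-1)          # (b, n)
            scores = scores.masked_fill(mask, float("-inf"))
            idx = scores.argmax(dim=-1)                    # greedy pick per document
            picked.append(idx)
            mask = mask.scatter(1, idx.unsqueeze(1), True)
            chosen = sents[torch.arange(b), idx]           # feed selection back in
            state = self.sel_gru(chosen, state)            # update selection history
        return torch.stack(picked, dim=1)                  # (b, n_select) indices


# Toy usage: a batch of 2 documents, 10 sentences of 12 word ids each.
model = NeuSumSketch(vocab_size=1000)
docs = torch.randint(0, 1000, (2, 10, 12))
print(model(docs))  # indices of the 3 selected sentences per document
```

At training time, the greedy argmax would be replaced by supervision against the gain-based target distribution described under Methodological Innovations below.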
Methodological Innovations
- Hierarchical Encoding: The architecture mirrors the hierarchical structure of documents, encoding at the sentence level (words to sentence vectors) and at the document level (sentence vectors in context), both with bidirectional gated recurrent units (BiGRUs).
- Joint Learning: Unlike conventional systems that decouple scoring from selection, NeuSum makes the two processes interdependent. Because each sentence is scored for what it adds to the evolving summary, the model naturally accounts for novelty and redundancy.
- End-to-End Training: NeuSum eliminates the need for handcrafted features, learning salient patterns directly from data through its neural architecture.
- Gain-based Scoring: Rather than predicting an absolute salience score, the model learns to predict the relative ROUGE F1 gain of each candidate sentence, i.e., the incremental informative value it contributes given the sentences already selected (formalized just below).
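In the paper's formulation (notation lightly paraphrased here), the supervision at extraction step t is the ROUGE F1 gain of each remaining sentence over the partial summary; the gains are normalized and softened into a target distribution Q, and the model's score distribution P is trained toward it with a KL-divergence loss. In the block below, g̃ denotes the normalized gain (min-max normalization in the paper) and τ a sharpening coefficient.

```latex
% Gain-based supervision at extraction step t.
% S_{t-1}: already-selected sentences; r(.): ROUGE F1;
% \tilde{g}: normalized gain; \tau: sharpening coefficient.
\begin{align}
  g\big(s_i \mid \mathbb{S}_{t-1}\big) &= r\big(\mathbb{S}_{t-1} \cup \{s_i\}\big) - r\big(\mathbb{S}_{t-1}\big)\\
  Q(s_i) &= \frac{\exp\big(\tau\,\tilde{g}(s_i)\big)}{\sum_j \exp\big(\tau\,\tilde{g}(s_j)\big)}\\
  \mathcal{L} &= \mathrm{KL}\big(P \,\Vert\, Q\big)
\end{align}
```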
Experimental Validation
Experimental validation on the CNN/Daily Mail dataset demonstrates NeuSum's superiority over existing extractive summarization baselines: it achieves a ROUGE-2 F1 score of 19.01, surpassing simple baselines such as LEAD3 and NN-SE as well as stronger systems such as SummaRuNNer and CRSum. The selected summaries also exhibit less redundancy, consistent with the gain-based evaluation of each sentence against the partial summary.
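For readers who want to compute comparable numbers, the snippet below shows a ROUGE-2 F1 calculation with the `rouge-score` Python package (the texts are toy examples of mine). Note this is a convenience approximation: the paper's reported scores come from the original ROUGE toolkit, and the two can differ slightly in preprocessing.

```python
# ROUGE-2 F1 for an extracted summary against a reference summary,
# using the `rouge-score` package (pip install rouge-score).
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge2"], use_stemmer=True)
reference = "police arrested the suspect on tuesday after a brief chase ."
extracted = "the suspect was arrested by police on tuesday ."
result = scorer.score(reference, extracted)["rouge2"]
print(f"P={result.precision:.3f} R={result.recall:.3f} F1={result.fmeasure:.3f}")
```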
Implications and Future Directions
NeuSum carries significant implications for the design of summarization systems, showing that integrating scoring and selection can improve extraction quality. Its ability to adapt dynamically to previously selected content may pave the way for more nuanced, context-sensitive summarization frameworks.
Future research could explore extending the NeuSum framework to abstractive summarization tasks, where the interdependence of content generation and evaluation is even more pronounced. Additionally, investigating the application of NeuSum in multi-document summarization and exploring its adaptability to non-news domains could yield fruitful insights.
By advancing extractive summarization through joint learning, this work opens promising avenues for improving automatic text summarization, with potential applications across natural language understanding and information extraction.