Extractive Summarization as Text Matching
The paper "Extractive Summarization as Text Matching" redefines the conventional methodologies utilized in neural extractive summarization systems. It approaches extractive summarization as a semantic text matching issue, proposing a paradigm shift from traditional sentence-level extractors to a summary-level framework. This novel approach leverages semantic proximity in embeddings to achieve improved summarization efficacy.
Framework and Methodology
The proposed framework, named MatchSum, treats extractive summarization as a problem of matching a document against candidate summaries in a shared semantic space. This contrasts with the traditional approach of scoring and extracting sentences individually and then concatenating them, which often yields redundant, suboptimal summaries. Matching at the summary level lets the model evaluate each candidate as a whole rather than through the scores of its individual sentences.
A pivotal component of the framework is its instantiation as a simple Siamese-BERT model. A weight-sharing, pre-trained BERT encoder produces semantic representations of the document and of each candidate summary, and cosine similarity between the two representations measures their closeness. Using BERT ensures that the representations are contextually rich, a factor critical for reliable semantic matching.
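To make the matching step concrete, the sketch below embeds a document and candidate summaries with a shared encoder and ranks the candidates by cosine similarity. It is a minimal illustration, assuming the Hugging Face transformers library and [CLS]-token pooling; the model name, truncation length, and example texts are placeholders, not the authors' released code.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # placeholder; the paper also reports a RoBERTa-base variant
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def embed(text: str) -> torch.Tensor:
    """Return the [CLS] vector as the text's semantic representation."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = encoder(**inputs)
    return outputs.last_hidden_state[:, 0]  # shape: (1, hidden_size)

document = "The storm closed three highways. Crews restored power overnight."
candidates = [
    "The storm closed three highways.",
    "Crews restored power overnight after the storm closed highways.",
]

# Rank candidates by cosine similarity to the document; the closest
# candidate in the shared semantic space is chosen as the summary.
scores = [F.cosine_similarity(embed(document), embed(c)).item() for c in candidates]
best = candidates[max(range(len(candidates)), key=scores.__getitem__)]
```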
The Siamese-BERT is trained with a margin-based triplet loss built on two principles: the gold summary should be semantically closer to the source document than any candidate, and better candidates (ranked by ROUGE against the gold summary) should be closer to the document than worse ones.
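A sketch of such a loss is shown below, assuming candidates are pre-sorted in descending order of ROUGE against the gold summary; the margin values gamma1 and gamma2 are illustrative hyperparameters, not the paper's tuned settings.

```python
import torch

def margin_ranking_loss(gold_sim: torch.Tensor,
                        cand_sims: torch.Tensor,
                        gamma1: float = 0.0,
                        gamma2: float = 0.01) -> torch.Tensor:
    """Margin-based triplet loss in the spirit of MatchSum.

    gold_sim:  scalar tensor, cosine(document, gold summary).
    cand_sims: 1-D tensor of cosine(document, candidate_i), candidates
               pre-sorted in descending order of ROUGE vs. the gold summary.
    """
    # Principle 1: the gold summary should outscore every candidate.
    loss = torch.clamp(cand_sims - gold_sim + gamma1, min=0).sum()

    # Principle 2: a better-ranked candidate should outscore a worse-ranked
    # one, with a margin that grows with the rank gap (j - i).
    n = cand_sims.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            loss = loss + torch.clamp(
                cand_sims[j] - cand_sims[i] + (j - i) * gamma2, min=0)
    return loss
```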
Experimental Evaluation
Experiments across six datasets confirm substantial improvements in summarization performance. On the CNN/DailyMail dataset, the framework achieved a state-of-the-art ROUGE-1 score of 44.41 with a RoBERTa-base encoder, a significant improvement over existing models, including sentence-level extractors enhanced with redundancy-removal heuristics such as Trigram Blocking.
Further evaluations on datasets with varying summary lengths and domains, such as XSum and Multi-News, highlight the flexibility and robustness of the MatchSum framework. It proves particularly effective on datasets requiring medium-length summaries, showing a marked ability to identify pearl-summaries: candidates whose constituent sentences do not rank highest under sentence-level scoring but which score best at the summary level.
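The pearl-summary notion can be made concrete with a small check: rank sentences individually, rank whole candidates, and see whether the best candidate differs from the greedy sentence-level pick. A hypothetical sketch, assuming Google's rouge-score package; the brute-force enumeration here stands in for the paper's more efficient pipeline.

```python
from itertools import combinations
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)

def is_pearl(sentences: list[str], gold: str, size: int = 2) -> bool:
    """True if the best summary-level candidate is a 'pearl-summary',
    i.e. not simply the top individually-scoring sentences."""
    # Sentence-level ranking: score each sentence against the gold summary.
    sent_scores = [scorer.score(gold, s)["rouge1"].fmeasure for s in sentences]
    greedy = set(sorted(range(len(sentences)),
                        key=lambda i: sent_scores[i], reverse=True)[:size])

    # Summary-level ranking: score each whole candidate against the gold.
    best = max(combinations(range(len(sentences)), size),
               key=lambda c: scorer.score(
                   gold, " ".join(sentences[i] for i in c))["rouge1"].fmeasure)
    return set(best) != greedy
```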
Implications and Future Directions
The introduction of a summary-level semantic matching framework signifies a potential shift in how extractive summarization models are conceptualized. The method not only sidesteps the limitations of sentence-level extractors but also transfers to different text domains with minimal modification.
Moving forward, further exploration into different architectural instantiations of the matching paradigm could yield even more powerful summarization models. The computational cost of scoring many candidate summaries, mitigated by the authors' candidate pruning strategy, remains an area for optimization and innovation.
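As a rough illustration of that pruning step, the sketch below keeps only the highest-scoring sentences from a sentence-level extractor and then enumerates fixed-size combinations as candidates. The top_k and summary_sizes values are assumptions for illustration; in the paper, sentence scores come from BertExt.

```python
from itertools import combinations

def generate_candidates(sentences: list[str],
                        sentence_scores: list[float],
                        top_k: int = 5,
                        summary_sizes: tuple[int, ...] = (2, 3)) -> list[str]:
    """Candidate-pruning sketch: restrict to the top_k sentences, then
    enumerate every combination of the allowed summary sizes."""
    # Keep the top_k sentences by score, preserving document order.
    top_idx = sorted(range(len(sentences)),
                     key=lambda i: sentence_scores[i], reverse=True)[:top_k]
    top_idx.sort()

    candidates = []
    for size in summary_sizes:
        for combo in combinations(top_idx, size):
            candidates.append(" ".join(sentences[i] for i in combo))
    return candidates
```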
Additionally, advancing this model could involve integrating more nuanced semantic encoders or dynamically adjusting the semantic space to better suit specific datasets, particularly those with longer summaries or more complex semantic structure.
Overall, this paper lays critical groundwork for further advancements in text summarization, encouraging a shift toward semantics-driven extractive methodologies that could redefine the landscape of automated text comprehension and abstraction.