An Examination of Extractive and Abstractive Neural Document Summarization with Transformer Language Models
The paper investigates the application of transformer language models (TLMs) to document summarization, distinguishing between extractive and abstractive approaches. The authors propose a methodology that combines extractive and abstractive elements to improve summarization of long textual inputs such as scientific articles.
Methodology
The authors initially address the challenge of handling extensive documents through an innovative two-step strategy:
- Extractive Step: Two hierarchical models, a sentence pointer network and a sentence classifier, are used to identify and extract the most salient sentences from the document. This step condenses the document, focusing the subsequent transformer model's limited context on pertinent content.
- Abstractive Step: The extracted sentences condition a transformer language model to generate a coherent and concise summary. This stage uses a GPT-like decoder-only architecture, which, unlike traditional seq2seq models, does not explicitly divide the problem into separate encoding and decoding tasks.
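The two-step pipeline above can be sketched in miniature as follows. The frequency-based salience score and the prompt format are illustrative stand-ins for the paper's trained pointer network and TLM, not the authors' actual models:

```python
import re
from collections import Counter

def extract_salient_sentences(document: str, k: int = 3) -> list[str]:
    """Toy extractive step: score each sentence by the document-level
    frequency of its words and keep the top-k, in document order.
    (A stand-in for the trained sentence pointer network / classifier.)"""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    freq = Counter(re.findall(r"[a-z]+", document.lower()))

    def score(sent: str) -> float:
        toks = re.findall(r"[a-z]+", sent.lower())
        return sum(freq[t] for t in toks) / (len(toks) or 1)

    top = set(sorted(sentences, key=score, reverse=True)[:k])
    return [s for s in sentences if s in top]  # restore document order

def build_tlm_input(extracted: list[str]) -> str:
    """Toy abstractive step: concatenate the extracted sentences into the
    conditioning context for a decoder-only (GPT-like) language model,
    which would then generate the summary as a continuation."""
    return " ".join(extracted) + "\nTL;DR: "

doc = ("Transformers summarize documents. Long documents exceed the context "
       "window. An extractive step selects salient sentences. The model then "
       "generates an abstractive summary conditioned on those sentences.")
context = build_tlm_input(extract_salient_sentences(doc, k=2))
```

In the real system, `context` would be fed to the trained TLM, whose continuation is the generated summary; the extractive step is what makes long inputs fit within the model's context window.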
Empirical Results
The proposed method was evaluated on several large datasets, including arXiv, PubMed, and bigPatent, and outperformed existing extractive and abstractive methods. Noteworthy findings include:
- The TLM conditioned on the extracted sentences achieved higher ROUGE scores than prior methods.
- The use of a transformer model without a copying mechanism resulted in summaries that are more abstractive in nature, with minimal copying of verbatim phrases from the original document.
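A minimal sketch of how such findings are typically quantified: a simplified ROUGE-1 F1 for overlap with a reference summary, and n-gram novelty (the fraction of summary n-grams absent from the source) as a proxy for abstractiveness. The whitespace tokenization here is a simplification of the paper's actual evaluation setup:

```python
from collections import Counter

def rouge1_f1(reference: str, summary: str) -> float:
    """Simplified ROUGE-1: F1 score over unigram overlap with the reference."""
    ref = Counter(reference.lower().split())
    hyp = Counter(summary.lower().split())
    overlap = sum((ref & hyp).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

def ngram_novelty(source: str, summary: str, n: int = 2) -> float:
    """Fraction of summary n-grams that never appear in the source document;
    higher values indicate a more abstractive (less copied) summary."""
    def ngrams(text: str) -> set:
        toks = text.lower().split()
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}
    src, hyp = ngrams(source), ngrams(summary)
    if not hyp:
        return 0.0
    return len(hyp - src) / len(hyp)

source = "the model reads the full document and selects salient sentences"
extractive = "the model selects salient sentences"      # mostly copied
abstractive = "key content is chosen before generation"  # fully rephrased
```

Here `ngram_novelty(source, abstractive)` exceeds `ngram_novelty(source, extractive)`, matching the paper's observation that a TLM without a copy mechanism produces summaries with fewer verbatim n-grams.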
Implications and Future Directions
From a practical perspective, the proposed approach offers enhanced summarization capabilities, which are particularly beneficial for domains that require processing long documents, such as academic publishing and patent documentation. Theoretically, this work illustrates the potential of extractive techniques as a preliminary step to improve the focus and relevance of the contextual input to transformer models in generation tasks.
Future research directions might explore end-to-end training paradigms that tightly integrate extractive and abstractive components, potentially improving efficiency and summary quality. Additionally, there is the open challenge of ensuring factual correctness in generated summaries, particularly significant in scientific contexts where inaccuracies could have substantial ramifications.
In conclusion, this paper contributes to the growing understanding of how transformer architectures can be adapted and applied to long-document summarization tasks, showcasing their ability to produce high-quality, concise, and abstractive summaries when appropriately conditioned on key document elements using extractive methods.