Generative Adversarial Network for Abstractive Text Summarization
The paper Generative Adversarial Network for Abstractive Text Summarization presents a framework for generating abstractive summaries with a Generative Adversarial Network (GAN). The model comprises two principal components: a generator that produces summaries from input text, and a discriminator that distinguishes machine-generated summaries from human-authored ones. This adversarial setup seeks to mitigate prevalent challenges in abstractive summarization, such as the generation of trivial summaries and exposure bias.
Technical Overview
- Generative Model: The generator is trained as a reinforcement learning agent. A bi-directional LSTM encoder maps the input text to hidden states, and an attention-based LSTM decoder then generates the summary. A switching pointer-generator network adds flexibility by allowing each word to be either generated from a fixed vocabulary or copied from the input sequence via a pointer.
- Discriminative Model: The discriminator is a binary classifier built on a CNN text-classification architecture; it distinguishes machine-generated summaries from human-written ones using multiple convolutional filters followed by max-over-time pooling.
- Optimization Strategy: Training first pre-trains both models, then alternates between updating the generator and the discriminator with adversarial feedback. The generator is optimized with a loss that combines a policy-gradient term with a maximum-likelihood term.
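The switching pointer-generator idea above can be sketched as a mixture of two distributions: a learned switch probability `p_gen` weights a vocabulary-generation distribution against a copy distribution built from the attention weights over source tokens. This is a minimal, framework-free sketch; the function name and the flat-list representation are illustrative choices, not the paper's implementation.

```python
def pointer_generator_mixture(p_gen, vocab_dist, attention, src_ids, vocab_size):
    """Mix generation and copying, as in a switching pointer-generator.

    p_gen      -- probability of generating from the fixed vocabulary (0..1)
    vocab_dist -- probability over the vocabulary from the decoder softmax
    attention  -- attention weights over the source positions (sums to 1)
    src_ids    -- vocabulary id of the token at each source position
    """
    # Build the copy distribution: attention mass lands on the source tokens.
    copy_dist = [0.0] * vocab_size
    for attn, tok in zip(attention, src_ids):
        copy_dist[tok] += attn
    # Final output distribution is a convex combination of the two.
    return [p_gen * v + (1.0 - p_gen) * c for v, c in zip(vocab_dist, copy_dist)]
```

Because both inputs are probability distributions and `p_gen` is a convex weight, the mixture is itself a valid distribution, which is what lets the model copy rare or out-of-vocabulary source words without a separate decoding mode.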
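The discriminator's core operations, convolving filters over the token embedding sequence and then max-over-time pooling each feature map down to a single activation, can be shown in a toy, dependency-free form. The function names and the ReLU choice here are illustrative assumptions; a real discriminator would use a deep-learning framework and many filters of several widths.

```python
def conv_feature_map(embeddings, kernel):
    """Slide one filter over a sequence of embedding vectors.

    embeddings -- list of per-token embedding vectors
    kernel     -- filter weights, shape (window_size, embedding_dim)
    Returns the ReLU-activated feature map (one value per window position).
    """
    w = len(kernel)
    out = []
    for i in range(len(embeddings) - w + 1):
        act = sum(kernel[j][d] * embeddings[i + j][d]
                  for j in range(w) for d in range(len(embeddings[0])))
        out.append(max(0.0, act))  # ReLU non-linearity
    return out

def max_over_time(feature_map):
    """Keep only the strongest activation of a feature map, so each filter
    yields one feature regardless of the summary's length."""
    return max(feature_map)
```

Max-over-time pooling is what makes the classifier length-invariant: each filter reports only whether its pattern occurred somewhere in the text, and the pooled features feed a final binary (human vs. machine) decision layer.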
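The combined generator objective can be sketched as a weighted sum of a REINFORCE-style policy-gradient term (negative log-likelihood weighted by the discriminator-derived reward) and a plain maximum-likelihood term. The function name, the mixing weight `beta`, and the per-token averaging are hypothetical simplifications, not the paper's exact formulation.

```python
def generator_loss(log_probs, rewards, beta=0.5):
    """Combined generator loss (sketch).

    log_probs -- log-probability the generator assigned to each output token
    rewards   -- reward for each token (e.g. derived from the discriminator)
    beta      -- assumed mixing weight between the two terms
    """
    n = len(log_probs)
    # Policy-gradient term: reward-weighted negative log-likelihood (REINFORCE).
    pg = -sum(r * lp for r, lp in zip(rewards, log_probs)) / n
    # Maximum-likelihood term: ordinary negative log-likelihood.
    mle = -sum(log_probs) / n
    return beta * pg + (1.0 - beta) * mle
```

Keeping the maximum-likelihood term alongside the policy-gradient term is a common stabilizer in adversarial text generation: the MLE anchor prevents the generator from drifting into degenerate outputs when the reward signal is noisy.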
Experimental Results
The experimental evaluation uses the CNN/Daily Mail dataset for training and validation, and shows competitive performance against established summarization models such as ABS, PGC, and DeepRL. The results exhibit notable improvements in ROUGE metrics—ROUGE-1, ROUGE-2, and ROUGE-L—demonstrating that adversarial training enhances the quality and readability of the generated summaries.
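For reference, ROUGE-1 measures unigram overlap between a candidate summary and a reference. A simplified computation (no stemming, stopword removal, or multi-reference handling, all of which standard ROUGE toolkits support) looks like this:

```python
from collections import Counter

def rouge_1(candidate, reference):
    """Simplified ROUGE-1: unigram recall, precision, and F1."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection takes the min count per word: the clipped overlap.
    overlap = sum((cand & ref).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 2 * recall * precision / (recall + precision) if overlap else 0.0
    return recall, precision, f1
```

ROUGE-2 replaces unigrams with bigrams, and ROUGE-L scores the longest common subsequence; reported results typically quote the F1 (or recall) of each.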
Implications and Future Directions
The research underscores the potential of GANs in refining abstractive summarization methodologies by addressing issues like exposure bias and the mismatch between training losses and evaluation metrics. While the model demonstrates significant quantitative improvements, further qualitative assessment and refinement could optimize its applicability across diverse text types and domains. Future research may investigate more advanced neural architectures or semantically rich features to bolster the generator's contextual understanding and output quality. Moreover, training on progressively larger and more varied datasets could broaden the model's generalization capabilities.
In conclusion, this paper contributes valuable insights into the field of abstractive summarization by introducing an adversarial framework that enhances generative performance and output quality. It sets a promising foundation for continued exploration and improvement of GAN-based approaches within natural language processing.