Overview of MaskGAN: Improving Text Generation with GANs
The paper "MaskGAN: Better Text Generation via Filling in the ______" presents an approach to neural text generation that leverages Generative Adversarial Networks (GANs) to improve the quality of generated text. Traditional text generation methods, such as autoregressive language models and seq2seq models, primarily optimize for perplexity, which does not adequately capture sample quality. Trained via maximum likelihood with teacher forcing, these models often produce poor samples when conditioned on sequences they never encountered during training, a train/inference mismatch commonly called exposure bias (illustrated in the toy sketch below). The paper addresses these limitations by training text generators with GANs, through a model called MaskGAN.
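To make the mismatch concrete, here is a toy, purely illustrative sketch; the `lm` dictionary and its probabilities are invented for this example and do not come from the paper. During training the model is always conditioned on ground-truth prefixes, while at inference it must condition on its own, possibly erroneous, samples.

```python
import random

# A toy next-token model (hypothetical numbers): imagine it was trained on
# "the cat sat" but, being imperfect, still puts mass on "dog" after "the".
lm = {
    "<s>": {"the": 1.0},
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 1.0},
}
gold = ["<s>", "the", "cat", "sat"]

# Teacher forcing (training): the model is always conditioned on the gold
# prefix, so it only ever sees contexts present in the training data.
for prev, nxt in zip(gold, gold[1:]):
    print(f"train: P({nxt} | {prev}) = {lm.get(prev, {}).get(nxt, 0.0)}")

# Free-running sampling (inference): the model conditions on its own output.
# If it samples "dog", it lands in a context it was never trained on, and
# generation degrades -- the mismatch MaskGAN's in-filling objective targets.
tok, out = "<s>", []
while tok in lm:
    dist = lm[tok]
    tok = random.choices(list(dist), weights=list(dist.values()))[0]
    out.append(tok)
print("sampled:", " ".join(out))
```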
Key Contributions
- GANs for Text Generation: The paper examines the challenges of applying GANs, which are designed for continuous outputs, to the discrete domain of text. The authors use reinforcement learning (RL), specifically policy gradients, to estimate generator gradients despite the non-differentiability of discrete token sampling.
- MaskGAN Architecture: MaskGAN introduces a text in-filling task: the model fills in masked-out tokens within a sequence, conditioned on the surrounding context, rather than generating text purely autoregressively. The extra conditioning information stabilizes GAN training and mitigates mode collapse, a known failure mode of standard GAN setups.
- Actor-Critic Framework: MaskGAN employs an actor-critic approach, with a critic network that estimates a value function used as a baseline to reduce the variance of the policy-gradient estimates during training. This is shown to substantially improve training robustness and sample quality; the first sketch after this list illustrates how the in-filling reward, the critic baseline, and the policy-gradient update fit together.
- Evaluation Metrics: The authors argue against relying solely on perplexity to evaluate text models and instead propose measuring the fraction of unique n-grams alongside human judgment. They show that MaskGAN achieves comparable or superior sample quality even though it does not directly optimize perplexity; a small example of such a diversity measure follows the training sketch below.
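The snippet below is a minimal sketch of the training signal just described, not the authors' implementation: `generator_logits`, `disc_prob`, and `critic_value` are random stand-ins for the outputs of the generator, discriminator, and critic networks, while the per-token reward r_t = log D(x_t), the discounted return-to-go, and the critic baseline follow the paper's description.

```python
import torch

# Toy dimensions: sequence of 6 tokens, vocabulary of 100 (illustrative only).
T, V = 6, 100
mask = torch.tensor([0, 1, 1, 0, 1, 0], dtype=torch.bool)  # True = position to fill in

# Random stand-ins for the three networks' outputs.
generator_logits = torch.randn(T, V, requires_grad=True)  # generator's token distributions
disc_prob = torch.rand(T)    # discriminator's probability that each filled token is real
critic_value = torch.rand(T) # critic's estimate of the expected future reward

# Sample fill-ins and keep their log-probabilities for the policy-gradient update.
dist = torch.distributions.Categorical(logits=generator_logits)
filled = dist.sample()
log_prob = dist.log_prob(filled)

# Per-token reward: log D(x_t), as in the paper.
reward = torch.log(disc_prob + 1e-8)

# Discounted return-to-go, then subtract the critic baseline to get the advantage.
gamma = 0.99
returns = torch.zeros(T)
running = 0.0
for t in reversed(range(T)):
    running = reward[t] + gamma * running
    returns[t] = running
# Stop gradients through the advantage; a real critic would be trained
# separately with its own regression loss toward the observed returns.
advantage = (returns - critic_value).detach()

# REINFORCE loss: -A_t * log pi(x_t), restricted to the masked positions.
loss = -(advantage * log_prob)[mask].sum()
loss.backward()
print(generator_logits.grad.shape)  # gradients flow despite discrete sampling
```

Subtracting the critic baseline before the update is what reduces variance: large swings in the raw log D rewards no longer translate directly into large swings in the gradient.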
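For the diversity side of evaluation, a minimal version of a unique-n-gram measure might look like the following; the paper reports the percentage of unique n-grams in generated samples, but the exact computation there may differ, and `distinct_n` is a name assumed here for illustration.

```python
def distinct_n(tokens, n):
    """Fraction of n-grams in `tokens` that are unique (a diversity proxy)."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)

sample = "the movie was good the movie was bad".split()
print(distinct_n(sample, 2))  # ~0.71: higher values mean less repetitive text
```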
Experimental Results
The authors provide detailed experimental evaluations on the Penn Treebank (PTB) and IMDB movie-review datasets, comparing MaskGAN against maximum-likelihood-trained models and other GAN variants. Notably, human evaluators judge MaskGAN's samples to be grammatically and contextually superior to those from the maximum-likelihood models.
Implications and Future Directions
MaskGAN sets a new direction for improving text generation by aligning training and inference procedures. It suggests that GANs, when adapted correctly, offer a viable alternative to traditional text generation paradigms. The work highlights the importance of matching model objectives with downstream task requirements.
The paper opens several avenues for future research:
- Model Architectures: Attention-based architectures may further improve the capacity of in-filling models.
- Training Stability: Continued work on GAN training algorithms, particularly those that combine RL with continuous relaxations, could yield more stable training procedures.
- Broader Applications: Extending the in-filling approach to other conditional generation tasks, such as dialogue systems and machine translation, could test and refine its applicability.
Conclusion
By addressing the deficiencies of traditional text generation models, MaskGAN represents a significant step toward integrating GANs into natural language processing. The work underscores GANs' potential for generating high-quality text and motivates further exploration and refinement in this domain.