Unsupervised Text Embedding Space Generation Using Generative Adversarial Networks for Text Synthesis (2306.17181v4)
Abstract: A Generative Adversarial Network (GAN) is a model for data synthesis that creates plausible data through the competition of a generator and a discriminator. Although GANs have been studied extensively for image synthesis, they face inherent limitations in natural language generation. Because natural language is composed of discrete tokens, the generator cannot receive gradients through backpropagation; therefore, most text-GAN studies generate sentences starting from a random token and train the generator with a reward signal. As a result, the generators in previous studies are pre-trained autoregressively before adversarial training, causing data memorization in which the synthesized sentences merely reproduce the training data. In this paper, we synthesize sentences using a framework close to the original GAN. Specifically, we propose Text Embedding Space Generative Adversarial Networks (TESGAN), which generate continuous text embedding spaces instead of discrete tokens, thereby sidestepping the gradient backpropagation problem. Furthermore, TESGAN is trained in an unsupervised manner that never directly references the text of the training data, which mitigates the data memorization issue. With this method, TESGAN can synthesize new sentences, demonstrating the potential of unsupervised learning for text synthesis. We expect follow-up research that combines LLMs with this perspective of viewing text as a continuous space.
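To make the core idea concrete, the sketch below shows a GAN whose generator emits a continuous embedding matrix (one vector per token position) rather than discrete tokens, so the discriminator's gradient flows back to the generator without any token sampling or reward estimation. This is a minimal illustration under assumed sizes and module names (`NOISE_DIM`, `SEQ_LEN`, `EMB_DIM`, `Generator`, `Discriminator`), not the paper's exact TESGAN architecture.

```python
# Minimal sketch (assumed architecture, not TESGAN's exact design):
# the generator outputs a continuous "text embedding space", i.e. a
# (SEQ_LEN x EMB_DIM) matrix, so gradients flow end to end without
# sampling discrete tokens.
import torch
import torch.nn as nn

NOISE_DIM, SEQ_LEN, EMB_DIM = 128, 16, 768  # illustrative sizes

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, SEQ_LEN * EMB_DIM),
        )
    def forward(self, z):
        # (batch, SEQ_LEN, EMB_DIM): a continuous embedding space
        return self.net(z).view(-1, SEQ_LEN, EMB_DIM)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(SEQ_LEN * EMB_DIM, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, 1),
        )
    def forward(self, emb):
        return self.net(emb.flatten(1))  # real/fake logit

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_emb):
    """One adversarial step; real_emb holds embeddings of real sentences."""
    batch = real_emb.size(0)
    fake_emb = G(torch.randn(batch, NOISE_DIM))
    # Discriminator step: real embeddings vs. detached fake embeddings.
    opt_d.zero_grad()
    d_loss = bce(D(real_emb), torch.ones(batch, 1)) + \
             bce(D(fake_emb.detach()), torch.zeros(batch, 1))
    d_loss.backward(); opt_d.step()
    # Generator step: gradients pass through the continuous embeddings,
    # so no REINFORCE-style reward is needed.
    opt_g.zero_grad()
    g_loss = bce(D(fake_emb), torch.ones(batch, 1))
    g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```

At inference time the generated embedding space would still need to be decoded into tokens (for example, by nearest-neighbor lookup against a pretrained embedding table or a separate decoder); the standard GAN losses are kept here only to show that training proceeds without pre-training the generator on the corpus text.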
Authors: Jun-Min Lee, Tae-Bin Ha