
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator (2212.10218v2)

Published 20 Dec 2022 in cs.CL

Abstract: Pre-trained models have achieved remarkable success in NLP. However, existing pre-training methods underutilize the benefits of language understanding for generation. Inspired by the idea of Generative Adversarial Networks (GANs), we propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator, unifying the abilities of language understanding and generation in a single model. Our model, named GanLM, is trained with two pre-training objectives: replaced token detection and replaced token denoising. Specifically, given masked source sentences, the generator outputs a distribution over target tokens, and the discriminator predicts whether the tokens sampled from that distribution are incorrect. The target sentence is then corrupted with the misclassified tokens to construct a noisy previous context, from which the model generates the gold sentence. Together, both tasks improve language understanding and generation by selectively using the denoising data. Extensive experiments on language generation benchmarks show that GanLM, with its strong language understanding capability, outperforms various strong pre-trained language models (PLMs) and achieves state-of-the-art performance.
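The data construction behind the two objectives can be illustrated with a small sketch. This is a hypothetical helper (not the authors' code): it takes the gold target tokens, the tokens sampled from the generator's distribution, and the discriminator's per-token predictions, then derives the replaced-token-detection labels and the noisy context used for the denoising objective, in which only the tokens the discriminator *misclassified* replace the gold ones.

```python
def build_ganlm_training_example(target_tokens, sampled_tokens, discriminator_pred):
    """Sketch of GanLM-style data construction (illustrative only).

    target_tokens:       gold target sentence tokens.
    sampled_tokens:      tokens sampled from the generator's output distribution.
    discriminator_pred:  discriminator's guess per token (True = judged correct).

    Returns (detection_labels, noisy_context).
    """
    # Replaced token detection: the true label for each sampled token is
    # whether it matches the gold token at that position.
    detection_labels = [s == t for s, t in zip(sampled_tokens, target_tokens)]

    # Replaced token denoising: where the discriminator misclassified a token
    # (its prediction disagrees with the true label), keep the sampled token,
    # producing the noisy previous context from which the gold sentence is
    # regenerated; elsewhere keep the gold token.
    noisy_context = [
        s if pred != label else t
        for s, t, pred, label in zip(
            sampled_tokens, target_tokens, discriminator_pred, detection_labels
        )
    ]
    return detection_labels, noisy_context


# Example: the generator replaced "cat" with "dog"; the discriminator
# wrongly judged every token correct, so "dog" survives into the context.
labels, noisy = build_ganlm_training_example(
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    [True, True, True],
)
print(labels)  # [True, False, True]
print(noisy)   # ['the', 'dog', 'sat']
```

The actual model applies these signals through neural objectives over distributions rather than discrete token lists; the sketch only shows how the detection labels and the noisy context relate.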

Authors (10)
  1. Jian Yang (505 papers)
  2. Shuming Ma (83 papers)
  3. Li Dong (154 papers)
  4. Shaohan Huang (79 papers)
  5. Haoyang Huang (27 papers)
  6. Yuwei Yin (21 papers)
  7. Dongdong Zhang (79 papers)
  8. Liqun Yang (18 papers)
  9. Furu Wei (291 papers)
  10. Zhoujun Li (122 papers)
Citations (24)

