A Semi-Supervised Approach for Low-Resourced Text Generation (1906.00584v1)

Published 3 Jun 2019 in cs.CL

Abstract: Recently, encoder-decoder neural models have achieved great success on text generation tasks. However, the performance of such models is usually limited by the scale of well-labeled data, which is very expensive to obtain. This low-resource problem (a shortage of labeled data) is common across text generation tasks, while unlabeled data are usually abundant. In this paper, we propose a method that exploits unlabeled data to improve the performance of such models in low-resource settings. We use a denoising auto-encoder (DAE) and language model (LM) based reinforcement learning (RL) to enhance the training of the encoder and decoder with unlabeled data. Our method adapts to different text generation tasks and achieves significant improvements over basic text generation models.
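As a rough illustration of the two unlabeled-data signals named in the abstract, the sketch below pairs a denoising auto-encoder (DAE) reconstruction step with an LM-scored REINFORCE step on a toy seq2seq model. Everything here is an assumption made for illustration: the GRU architecture, the dropout-and-shuffle noise function, the toy vocabulary sizes, and the `TinyLM` reward model are stand-ins, not the paper's actual configuration.

```python
import random
import torch
import torch.nn as nn

# Toy sizes; these, and the noise scheme below, are illustrative assumptions.
VOCAB, EMB, HID, PAD, BOS = 1000, 64, 128, 0, 1

def add_noise(tokens, drop_prob=0.1, shuffle_k=3):
    """Corrupt a token list by random word dropout plus local shuffling."""
    kept = [t for t in tokens if random.random() > drop_prob] or tokens[:1]
    keys = [i + random.uniform(0, shuffle_k) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept))]

class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB, padding_idx=PAD)
        self.enc = nn.GRU(EMB, HID, batch_first=True)
        self.dec = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, src, dec_in):
        _, h = self.enc(self.emb(src))           # encode (noisy) source
        out, _ = self.dec(self.emb(dec_in), h)   # teacher-forced decode
        return self.out(out)                     # logits over vocabulary

class TinyLM(nn.Module):
    """Stand-in LM used only to reward sampled sequences in the RL step."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB, padding_idx=PAD)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def log_prob(self, seq):
        h, _ = self.rnn(self.emb(seq[:, :-1]))
        logp = self.out(h).log_softmax(-1)
        return logp.gather(-1, seq[:, 1:].unsqueeze(-1)).mean()

model, lm = Seq2Seq(), TinyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)

# DAE step: reconstruct an unlabeled sentence from its corrupted copy.
sentence = [5, 42, 7, 99, 13]                    # toy token ids
src = torch.tensor([add_noise(sentence)])
dec_in = torch.tensor([[BOS] + sentence[:-1]])
tgt = torch.tensor([sentence])
dae_loss = loss_fn(model(src, dec_in).reshape(-1, VOCAB), tgt.reshape(-1))

# LM-based RL step (REINFORCE): sample from the decoder and use the
# LM's log-likelihood of the sample as the reward signal.
_, h = model.enc(model.emb(src))
tok, logps, sampled = torch.tensor([[BOS]]), [], [BOS]
for _ in range(5):
    out, h = model.dec(model.emb(tok), h)
    dist = torch.distributions.Categorical(logits=model.out(out[:, -1]))
    action = dist.sample()                       # shape (1,)
    logps.append(dist.log_prob(action))
    sampled.append(action.item())
    tok = action.unsqueeze(1)
reward = lm.log_prob(torch.tensor([sampled])).detach()
rl_loss = -reward * torch.stack(logps).sum()

opt.zero_grad()
(dae_loss + rl_loss).backward()
opt.step()
```

The sketch shows only the unlabeled-data side; in a full semi-supervised setup the supervised cross-entropy loss on labeled pairs would be trained jointly or alternately with these objectives.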

Authors (2)
  1. Hongyu Zang (12 papers)
  2. Xiaojun Wan (99 papers)
Citations (4)