Reinforcement Learning with Token-level Feedback for Controllable Text Generation (2403.11558v1)

Published 18 Mar 2024 in cs.CL and cs.AI

Abstract: To meet the requirements of real-world applications, it is essential to control the generations of LLMs. Prior research has introduced reinforcement learning (RL) into controllable text generation, but most existing methods suffer from overfitting (finetuning-based methods) or semantic collapse (post-processing methods). Moreover, current RL methods are generally guided by coarse-grained (sentence/paragraph-level) feedback, which may lead to suboptimal performance owing to semantic twists or progressions within sentences. To tackle this, we propose a novel reinforcement learning algorithm named TOLE, which formulates TOken-LEvel rewards for controllable text generation and employs a "first-quantize-then-noise" paradigm to enhance the robustness of the RL algorithm. Furthermore, TOLE can be flexibly extended to multiple constraints with little computational expense. Experimental results show that our algorithm achieves superior performance on both single-attribute and multi-attribute control tasks. We have released our code at https://github.com/WindyLee0822/CTG
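The abstract's "first-quantize-then-noise" idea can be illustrated with a minimal sketch. The paper itself does not specify the exact implementation here; this hypothetical example assumes each generated token has a raw score (e.g., the change in an attribute classifier's probability after appending that token), which is first quantized into coarse bins and then perturbed with small noise so the policy does not overfit to exact classifier scores:

```python
import random

def token_level_rewards(prob_deltas, n_bins=5, noise_scale=0.1, seed=0):
    """Hypothetical sketch of a 'first-quantize-then-noise' reward scheme.

    prob_deltas: one raw score per token (assumed: the change in an
    attribute classifier's probability after appending that token).
    1. Quantize the raw scores into n_bins equal-width buckets and map
       each token to its bucket's midpoint in [-1, 1].
    2. Perturb each quantized reward with small uniform noise.
    """
    rng = random.Random(seed)
    lo, hi = min(prob_deltas), max(prob_deltas)
    width = (hi - lo) / n_bins or 1.0  # guard against all-equal scores
    rewards = []
    for d in prob_deltas:
        bin_idx = min(int((d - lo) / width), n_bins - 1)
        # Midpoint of the bin after rescaling the bin range to [-1, 1].
        midpoint = -1.0 + (2.0 * bin_idx + 1.0) / n_bins
        rewards.append(midpoint + rng.uniform(-noise_scale, noise_scale))
    return rewards
```

Under this reading, quantization makes the reward signal robust to small fluctuations in classifier scores, while the added noise discourages the policy from exploiting any fixed reward boundary; the actual reward formulation is defined in the paper and released code.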

Authors (6)
  1. Wendi Li (11 papers)
  2. Wei Wei (424 papers)
  3. Kaihe Xu (2 papers)
  4. Wenfeng Xie (8 papers)
  5. Dangyang Chen (20 papers)
  6. Yu Cheng (354 papers)
Citations (2)