Specializing Small Language Models towards Complex Style Transfer via Latent Attribute Pre-Training (2309.10929v1)
Abstract: In this work, we introduce the task of complex text style transfer and construct complex text datasets based on two widely applicable scenarios. Our dataset is the first large-scale dataset of its kind, with 700 rephrased sentences and 1,000 sentences from the game Genshin Impact. While large language models (LLMs) have shown promise in complex text style transfer, they have drawbacks such as data privacy concerns, network instability, and high deployment costs. To address these issues, we explore the effectiveness of small models (smaller than T5-3B) with implicit style pre-training through contrastive learning. We also propose a ChatGPT-based method for automatically evaluating text generation quality that aligns with human evaluations. Finally, we compare our approach with existing methods and show that our model achieves state-of-the-art performance among few-shot text style transfer models.
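The abstract names contrastive learning as the mechanism for implicit (latent attribute) style pre-training of a small T5-class model, but does not spell out the objective. The following is a minimal sketch assuming an InfoNCE-style loss over same-style sentence pairs; the encoder choice (t5-small), the helper names (embed, style_contrastive_loss), and the toy sentence pairs are illustrative assumptions, not details taken from the paper.

```python
# Sketch of contrastive "latent attribute" (style) pre-training.
# Assumptions (not from the paper): a T5 encoder is mean-pooled into sentence
# embeddings, and an InfoNCE-style loss pulls same-style sentences together.
import torch
import torch.nn.functional as F
from transformers import T5EncoderModel, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
encoder = T5EncoderModel.from_pretrained("t5-small")

def embed(sentences):
    """Mean-pool the T5 encoder hidden states into one vector per sentence."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state           # (B, L, H)
    mask = batch["attention_mask"].unsqueeze(-1)           # (B, L, 1)
    return (hidden * mask).sum(1) / mask.sum(1)            # (B, H)

def style_contrastive_loss(anchors, positives, temperature=0.1):
    """InfoNCE: the positive for each anchor is the same-style sentence at the
    same batch index; every other sentence in the batch serves as a negative."""
    a = F.normalize(embed(anchors), dim=-1)
    p = F.normalize(embed(positives), dim=-1)
    logits = a @ p.T / temperature                         # (B, B) similarity matrix
    targets = torch.arange(len(anchors))                   # diagonal entries are positives
    return F.cross_entropy(logits, targets)

# Toy usage: each pair shares a style register (here, an archaic game-dialogue tone
# vs. a plain polite tone); the loss encourages style-consistent embeddings.
loss = style_contrastive_loss(
    ["Thou shalt rue this day.", "Kindly refrain from touching that."],
    ["I shall not forgive this transgression.", "Please do not handle the artifact."],
)
loss.backward()
```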