Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation (2002.10210v1)

Published 24 Feb 2020 in cs.CL

Abstract: In this paper, we focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer: it aims to preserve text style while altering the content. Specifically, the input is a set of structured records and a reference text describing another record set. The output is a summary that accurately describes the partial content of the source record set in the same writing style as the reference. The task is unsupervised due to the lack of parallel data, and it is challenging to select suitable records and style words from the bi-aspect inputs and to generate a high-fidelity long document. To tackle these problems, we first build a dataset based on a basketball game report corpus as our testbed, and present an unsupervised neural model with an interactive attention mechanism, which learns the semantic relationship between records and reference texts to achieve better content transfer and style preservation. In addition, we explore the effectiveness of back-translation in our task for constructing pseudo-training pairs. Empirical results show the superiority of our approaches over competitive methods, and the models also yield a new state-of-the-art result on a sentence-level dataset.
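The abstract does not specify the form of the interactive attention mechanism; a minimal co-attention sketch between record embeddings and reference-token embeddings (all function names, dimensions, and the dot-product scoring are illustrative assumptions, not the paper's exact model) could look like:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def interactive_attention(records, reference):
    """Hypothetical co-attention between bi-aspect inputs.

    records:   (m, d) embeddings of source records
    reference: (n, d) embeddings of reference-text tokens
    Returns record vectors contextualized by the reference, and
    reference-token vectors contextualized by the records.
    """
    scores = records @ reference.T           # (m, n) similarity matrix
    rec_to_ref = softmax(scores, axis=1)     # each record attends over tokens
    ref_to_rec = softmax(scores, axis=0)     # each token attends over records
    records_ctx = rec_to_ref @ reference     # (m, d) content-selection signal
    reference_ctx = ref_to_rec.T @ records   # (n, d) style-preservation signal
    return records_ctx, reference_ctx

rng = np.random.default_rng(0)
rec = rng.normal(size=(4, 8))   # 4 toy records
ref = rng.normal(size=(6, 8))   # 6 toy reference tokens
rc, fc = interactive_attention(rec, ref)
print(rc.shape, fc.shape)  # (4, 8) (6, 8)
```

In such a setup, the record-side context would feed content selection while the reference-side context would inform which style words to retain, matching the bi-aspect framing described in the abstract.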

Authors (8)
  1. Xiaocheng Feng (54 papers)
  2. Yawei Sun (4 papers)
  3. Bing Qin (186 papers)
  4. Heng Gong (5 papers)
  5. Yibo Sun (12 papers)
  6. Wei Bi (62 papers)
  7. Xiaojiang Liu (27 papers)
  8. Ting Liu (329 papers)
Citations (5)
