
Learning to Generate Text in Arbitrary Writing Styles (2312.17242v2)

Published 28 Dec 2023 in cs.CL

Abstract: Prior work in style-controlled text generation has focused on tasks such as emulating the style of prolific literary authors, producing formal or informal text, and mitigating toxicity of generated text. Plentiful demonstrations of these styles are available, and as a result modern LLMs are often able to emulate them, either via prompting or discriminative control. However, in applications such as writing assistants, it is desirable for LLMs to produce text in an author-specific style on the basis of a potentially small writing sample. For example, someone writing in a particular dialect may prefer writing suggestions that retain the same dialect. We find that instruction-tuned LLMs can struggle to reproduce author-specific style demonstrated in a prompt. Instead, we propose to guide an LLM to generate text in a target style using contrastively-trained representations that capture stylometric features. Our approach (StyleMC) combines an author-adapted LLM with sequence-level inference to improve stylistic consistency, and is found to be effective in a variety of conditions, including unconditional generation and style transfer. Additionally, we find that the proposed approach can serve as an effective anonymization method, by editing a document to mask authorship while preserving the original meaning.
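
The abstract's core mechanism is a sequence-level, style-guided selection step: candidate generations are scored against a stylometric representation of the target author, and the most stylistically consistent one is kept. Below is a minimal, hypothetical sketch of that idea only; the names `style_embed`, `author_profile`, and `rerank_by_style` are illustrative stand-ins, a crude character n-gram hashing encoder substitutes for the paper's contrastively-trained style encoder, and the candidate texts would normally be sampled from an author-adapted LLM rather than hard-coded.

```python
# Hypothetical sketch of style-guided reranking ("sequence-level inference"):
# score candidate generations by similarity to a target author's style profile
# and keep the closest one. This is NOT the paper's StyleMC implementation.
import hashlib
import numpy as np

DIM = 256  # size of the hashed feature space (illustrative choice)

def style_embed(text: str, n: int = 3) -> np.ndarray:
    """Hash character n-grams into a fixed-size vector (toy stylometric features)."""
    vec = np.zeros(DIM)
    t = text.lower()
    for i in range(len(t) - n + 1):
        h = int(hashlib.md5(t[i:i + n].encode()).hexdigest(), 16)
        vec[h % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def author_profile(samples: list[str]) -> np.ndarray:
    """Average the style embeddings of a (potentially small) writing sample."""
    return np.mean([style_embed(s) for s in samples], axis=0)

def rerank_by_style(candidates: list[str], profile: np.ndarray) -> str:
    """Pick the candidate whose style embedding is closest to the author profile."""
    scores = [float(style_embed(c) @ profile) for c in candidates]
    return candidates[int(np.argmax(scores))]

# Usage: candidates would normally be sampled from an (author-adapted) LLM.
samples = ["gonna grab coffee, brb!!", "lol ok see u there in 5"]
candidates = [
    "I shall arrive at the cafe presently.",
    "omw, gonna be there in 5!!",
]
print(rerank_by_style(candidates, author_profile(samples)))
```

One appeal of reranking at the sequence level is that it sidesteps token-by-token control: any sampler can propose candidates, and the style encoder only needs to score whole passages against the target profile.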

Authors (4)
  1. Aleem Khan (6 papers)
  2. Andrew Wang (42 papers)
  3. Sophia Hager (4 papers)
  4. Nicholas Andrews (22 papers)
Citations (4)
