
Protecting Anonymous Speech: A Generative Adversarial Network Methodology for Removing Stylistic Indicators in Text (2110.09495v1)

Published 18 Oct 2021 in cs.LG, cs.CL, and cs.CR

Abstract: With Internet users constantly leaving a trail of text, whether through blogs, emails, or social media posts, the ability to write and protest anonymously is being eroded because artificial intelligence, when given a sample of previous work, can match text with its author out of hundreds of possible candidates. Existing approaches to authorship anonymization, also known as authorship obfuscation, often focus on protecting binary demographic attributes rather than identity as a whole. Even those that do focus on obfuscating identity require manual feedback, lose the coherence of the original sentence, or only perform well given a limited subset of authors. In this paper, we develop a new approach to authorship anonymization by constructing a generative adversarial network that protects identity and optimizes for three different losses corresponding to anonymity, fluency, and content preservation. Our fully automatic method achieves results comparable to other methods in terms of content preservation and fluency, but greatly outperforms baselines with regard to anonymization. Moreover, our approach generalizes well to an open-set context and anonymizes sentences from authors it has not encountered before.
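
As a rough illustration of what optimizing for three losses at once can look like, the sketch below combines an anonymity, a fluency, and a content-preservation term into a single training objective. The weighted-sum form, the function name `combined_loss`, and the weight parameters are assumptions made for illustration; the abstract does not say how the paper combines or weights its losses.

```python
def combined_loss(anonymity_loss: float,
                  fluency_loss: float,
                  content_loss: float,
                  w_anonymity: float = 1.0,
                  w_fluency: float = 1.0,
                  w_content: float = 1.0) -> float:
    """Hypothetical generator objective: a weighted sum of three loss terms.

    The weighted-sum form and the default weights are assumptions, not the
    paper's formulation.
    """
    return (w_anonymity * anonymity_loss
            + w_fluency * fluency_loss
            + w_content * content_loss)


# Example: weight anonymity twice as heavily as the other two terms.
total = combined_loss(0.5, 0.25, 0.25, w_anonymity=2.0)
```

In a sketch like this, raising one weight trades the other two objectives against it, which is the kind of balance between anonymity, fluency, and content preservation that the abstract describes.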

Authors (3)
  1. Rishi Balakrishnan (1 paper)
  2. Stephen Sloan (2 papers)
  3. Anil Aswani (49 papers)
Citations (1)
