Keep It Private: Unsupervised Privatization of Online Text (2405.10260v1)
Abstract: Authorship obfuscation techniques hold the promise of helping people protect their privacy in online communications by automatically rewriting text to hide the identity of the original author. However, obfuscation has been evaluated in narrow settings in the NLP literature and has primarily been addressed with superficial edit operations that can lead to unnatural outputs. In this work, we introduce an automatic text privatization framework that fine-tunes an LLM via reinforcement learning to produce rewrites that balance soundness, sense, and privacy. We evaluate it extensively on a large-scale test set of short- to medium-length English Reddit posts by 68k authors. We study how performance changes across evaluation conditions, including authorial profile length and authorship detection strategy. Our method maintains high text quality according to both automated metrics and human evaluation, and successfully evades several automated authorship attacks.
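The abstract says the rewrites balance soundness, sense, and privacy, but it does not state how the three objectives are combined into a training signal. The following is a minimal sketch only, assuming each objective is scored in [0, 1] and that the scores are multiplied so that a rewrite failing any one objective earns a low reward. The names `RewardScores` and `combined_reward` and the product rule are illustrative assumptions, not the paper's stated formulation.

```python
from dataclasses import dataclass


@dataclass
class RewardScores:
    """Hypothetical per-rewrite scores, each assumed to lie in [0, 1]."""
    soundness: float  # assumed: fluency / acceptability of the rewrite
    sense: float      # assumed: how well the original meaning is preserved
    privacy: float    # assumed: how poorly the rewrite matches the author's style


def combined_reward(scores: RewardScores) -> float:
    """Combine the three scores by multiplication (an assumed design choice).

    With a product, a score of 0 on any single objective drives the
    whole reward to 0, so no objective can be traded away entirely.
    """
    for name, value in vars(scores).items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return scores.soundness * scores.sense * scores.privacy


if __name__ == "__main__":
    fluent_but_identifiable = RewardScores(soundness=0.9, sense=0.9, privacy=0.1)
    balanced = RewardScores(soundness=0.8, sense=0.8, privacy=0.8)
    print(combined_reward(fluent_but_identifiable))
    print(combined_reward(balanced))
```

Under this assumed combination, the balanced rewrite scores higher than the fluent but still identifiable one, which is the kind of trade-off the abstract describes. A weighted sum would be another plausible choice, at the cost of letting one strong score compensate for a weak one.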
Authors: Calvin Bao, Marine Carpuat