A Survey of Text Watermarking in the Era of Large Language Models (2312.07913v6)

Published 13 Dec 2023 in cs.CL

Abstract: Text watermarking algorithms are crucial for protecting the copyright of textual content. Historically, their capabilities and application scenarios were limited. However, recent advancements in LLMs have revolutionized these techniques. LLMs not only enhance text watermarking algorithms with their advanced abilities but also create a need for employing these algorithms to protect their own copyrights or prevent potential misuse. This paper conducts a comprehensive survey of the current state of text watermarking technology, covering four main aspects: (1) an overview and comparison of different text watermarking techniques; (2) evaluation methods for text watermarking algorithms, including their detectability, impact on text or LLM quality, and robustness under targeted or untargeted attacks; (3) potential application scenarios for text watermarking technology; (4) current challenges and future directions for text watermarking. This survey aims to provide researchers with a thorough understanding of text watermarking technology in the era of LLMs, thereby promoting its further advancement.

Authors (10)
  1. Aiwei Liu (42 papers)
  2. Leyi Pan (7 papers)
  3. Yijian Lu (5 papers)
  4. Jingjing Li (98 papers)
  5. Xuming Hu (120 papers)
  6. Lijie Wen (58 papers)
  7. Irwin King (170 papers)
  8. Philip S. Yu (592 papers)
  9. Xi Zhang (302 papers)
  10. Hui Xiong (244 papers)
Citations (33)

Summary

Overview of Text Watermarking

The field of natural language processing has witnessed remarkable advancements with the rise of LLMs. As these models become more capable of generating high-quality text, concerns over the spread of misinformation, intellectual property rights, and academic integrity are growing. Text watermarking technology offers a potential solution to these challenges by embedding detectable patterns in generated texts that are difficult for humans to notice but easily identifiable by algorithms. This technology can help trace the origin of texts, discourage misuse, and curb content piracy.

Techniques and Comparisons

Text watermarking techniques can be broadly categorized by where the watermark is introduced: some methods watermark pre-existing text, while others embed the watermark during text generation by the LLM itself.

Watermarking for Existing Text

Here, the watermark is added to text that has already been written or generated. This category encompasses:

  • Format-based Watermarking: Embeds the watermark through subtle format modifications, such as whitespace manipulation or Unicode substitutions, without altering the wording (see the sketch after this list).
  • Lexical-based Watermarking: Employs synonym replacements to embed watermarks while preserving meaning.
  • Syntactic-based Watermarking: Uses syntax transformations that slightly modify sentence structures to carry the watermark.
  • Generation-based Watermarking: Uses pretrained language models to generate a watermarked version of the original text end-to-end, jointly encoding the content and the watermark message.
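
To make the existing-text approaches concrete, below is a minimal Python sketch of format-based embedding. It hides a bit string by substituting ordinary spaces with a visually similar Unicode space; the specific characters, bit scheme, and helper names (embed, extract) are illustrative assumptions, not the exact encoding of any particular method covered in the survey.

```python
# Illustrative format-based watermark: encode bits in the choice of space character.
ZERO = " "        # U+0020, an ordinary space encodes bit 0
ONE = "\u2009"    # a thin space encodes bit 1 (assumed, illustrative choice)

def embed(text: str, bits: str) -> str:
    """Replace the first len(bits) spaces with bit-carrying space characters."""
    out, i = [], 0
    for ch in text:
        if ch == " " and i < len(bits):
            out.append(ONE if bits[i] == "1" else ZERO)
            i += 1
        else:
            out.append(ch)
    return "".join(out)

def extract(text: str) -> str:
    """Read the bit carried by every space-like character in the text."""
    return "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))

marked = embed("the quick brown fox jumps over the lazy dog", "1011")
assert extract(marked).startswith("1011")
```

Because the visible wording is untouched, such watermarks are imperceptible to readers, but they are also fragile: re-typing the text or normalizing its Unicode removes them.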

Watermarking for LLMs

This approach alters the process by which LLMs themselves produce text:

  • Training Time Watermarking: Embeds watermarks into training data so that an LLM trained on that data produces watermarked output.
  • Watermarking During Logits Generation: Adjusts the token probability distribution (logits) at each generation step of the LLM, for example by boosting a pseudorandom subset of the vocabulary (a sketch follows this list).
  • Watermarking During Token Sampling: Steers the pseudorandom choices made when sampling tokens from the unmodified logits so that the sampled sequence itself encodes the watermark.
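
One widely studied instance of logits-time watermarking splits the vocabulary into a pseudorandom "green" subset keyed on the preceding token and boosts the logits of green tokens, so watermarked text over-uses them. The Python sketch below is a simplified illustration of that idea, not a reference implementation of any surveyed algorithm; the green-list fraction (GAMMA), logit boost (DELTA), and seeding rule are assumed values.

```python
import random

GAMMA = 0.5   # fraction of the vocabulary placed on the "green" list (assumed)
DELTA = 2.0   # logit boost given to green tokens (assumed)

def is_green(token_id: int, prev_token_id: int) -> bool:
    # Deterministic pseudorandom split of the vocabulary, keyed on the
    # previous token; the integer seeding scheme is an illustrative choice.
    rng = random.Random(prev_token_id * 1_000_003 + token_id)
    return rng.random() < GAMMA

def watermark_logits(logits: list[float], prev_token_id: int) -> list[float]:
    # Shift the distribution toward green tokens before sampling.
    return [x + DELTA if is_green(i, prev_token_id) else x
            for i, x in enumerate(logits)]
```

Because the bias is small and spread across many tokens, the effect on any single generation step is mild, while a detector that knows the seeding rule can accumulate evidence over the whole text.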

Evaluation Perspectives

To assess the efficacy of watermarking technologies, researchers consider four main evaluation perspectives:

  • Success Rate: Measures how frequently and accurately watermarks are detected, covering both zero-bit (presence/absence) and multi-bit (embedded message) watermarking (see the detection sketch after this list).
  • Text Quality: Evaluates the influence of watermarking on the text’s fluency, consistency, and overall quality, often through perplexity and semantic scores.
  • Robustness: Determines whether the watermark survives modifications, such as paraphrasing or editing, intended to remove it.
  • Unforgeability: Assesses the difficulty in replicating or forging watermarks by unauthorized third parties.
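
As an illustration of how the success rate is quantified for zero-bit schemes, the sketch below scores a token sequence against the same illustrative green-list rule used in the generation sketch and computes a one-proportion z-score; text is flagged as watermarked when the score exceeds a chosen threshold. GAMMA, the seeding rule, and any threshold (e.g. 4) are assumptions for illustration only.

```python
import math
import random

GAMMA = 0.5  # expected fraction of green tokens in unwatermarked text (assumed)

def is_green(token_id: int, prev_token_id: int) -> bool:
    # Same illustrative green-list rule as in the generation sketch above.
    return random.Random(prev_token_id * 1_000_003 + token_id).random() < GAMMA

def z_score(token_ids: list[int]) -> float:
    # One-proportion z-test: how far the observed green-token count sits
    # above its expectation under the null hypothesis of unwatermarked text.
    n = len(token_ids) - 1
    assert n > 0, "need at least two tokens"
    hits = sum(is_green(tok, prev) for prev, tok in zip(token_ids, token_ids[1:]))
    return (hits - GAMMA * n) / math.sqrt(GAMMA * (1 - GAMMA) * n)

# A text is flagged as watermarked when its z-score exceeds a chosen
# threshold, trading false positives against false negatives.
```

Reporting detection this way ties the success rate to a controllable false-positive rate, which is how detectability is typically compared across watermarking schemes.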

Applications and Implications

Text watermarking plays a pivotal role in several real-world domains:

  • Copyright Protection: Safeguards intellectual property by marking ownership of texts and LLM-generated datasets to prevent unauthorized duplication and training use.
  • Academic Integrity: Helps educational institutions distinguish LLM-generated submissions from student-originated work, upholding standards of academic honesty.
  • Fake News Detection: Assists in identifying and tracing the origins of AI-generated misinformation to preserve the authenticity of online content.

Conclusions

Text watermarking emerges as an indispensable technology for maintaining content integrity in the age of AI-generated text. As LLMs continue to evolve, watermarking techniques must adapt to new challenges, balancing robustness, payload capacity, and impact on text quality. The pursuit of unforgeability, especially in the face of sophisticated attacks, remains imperative. Moreover, its expanding applications hold significant promise for protecting intellectual property, fostering academic honesty, and countering the spread of misinformation.
