An Entropy-based Text Watermarking Detection Method (2403.13485v4)
Abstract: Text watermarking algorithms for LLMs can effectively identify machine-generated text by embedding and detecting hidden features in the text. Although current text watermarking algorithms perform well in most high-entropy scenarios, their performance in low-entropy scenarios still needs improvement. In this work, we argue that the influence of token entropy should be fully considered in the watermark detection process, \textit{i.e.}, the weight of each token during watermark detection should be customized according to its entropy, rather than setting the weights of all tokens to the same value as in previous methods. Specifically, we propose \textbf{E}ntropy-based Text \textbf{W}atermarking \textbf{D}etection (\textbf{EWD}), which gives higher-entropy tokens higher influence weights during watermark detection, so as to better reflect the degree of watermarking. Furthermore, the proposed detection process is training-free and fully automated. Our experiments demonstrate that EWD achieves better detection performance in low-entropy scenarios, and that the method is general, applying to texts with different entropy distributions. Our code and data are available\footnote{\url{https://github.com/luyijian3/EWD}}. Additionally, our algorithm can be accessed through MarkLLM \cite{pan2024markLLM}\footnote{\url{https://github.com/THU-BPM/MarkLLM}}.
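To make the weighting idea concrete, below is a minimal sketch of an entropy-weighted detection statistic in the spirit of EWD, built on the standard green-list z-test. It is not the authors' exact implementation: the function name `entropy_weighted_z_score`, the tensor layout, and the use of raw Shannon entropy as the weight (the paper may apply a different monotone transform of entropy) are all illustrative assumptions.

```python
import math
import torch

def entropy_weighted_z_score(logits, green_mask, gamma=0.5):
    """Sketch of an entropy-weighted watermark detection statistic.

    logits:     (T, V) tensor of the LM's logits at each generated position
    green_mask: (T,) boolean tensor, True where the generated token fell in
                the green list for its context
    gamma:      fraction of the vocabulary assigned to the green list
    """
    # Per-token Shannon entropy of the model's predictive distribution;
    # high-entropy positions carry more watermark signal.
    probs = torch.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum(dim=-1)  # (T,)

    # Assumption: use the entropy directly as each token's influence weight.
    # Setting w to all-ones recovers the unweighted green-list z-test.
    w = entropy

    # Weighted green count vs. its expectation under H0 (unwatermarked text),
    # normalized by the weighted standard deviation.
    observed = (w * green_mask.float()).sum()
    expected = gamma * w.sum()
    std = math.sqrt(gamma * (1.0 - gamma) * float((w ** 2).sum()))
    return float((observed - expected) / std)
```

The design intuition: under the null hypothesis each token is green with probability $\gamma$ regardless of its weight, so the weighted statistic remains approximately standard normal, while high-entropy tokens (where the watermark can actually bias sampling) contribute proportionally more to the observed count and its variance than low-entropy tokens (where the model's output is nearly forced).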