WaterMax: breaking the LLM watermark detectability-robustness-quality trade-off (2403.04808v3)
Abstract: Watermarking is a technical means to dissuade the malicious use of LLMs. This paper proposes WaterMax, a novel watermarking scheme that achieves high detectability while preserving the quality of the text generated by the original LLM. Its design leaves the LLM untouched: no modification of the weights, logits, temperature, or sampling technique. Whereas the watermarking techniques in the literature inherently trade quality against robustness, WaterMax instead trades robustness against computational complexity. Its performance is both theoretically proven and experimentally validated: it outperforms all state-of-the-art techniques under the most comprehensive benchmark suite. Code is available at https://github.com/eva-giboulot/WaterMax.
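The abstract does not spell out the mechanism, but it hints at a watermark built purely on re-sampling and selection rather than on altering the model's distribution. The minimal Python sketch below illustrates that kind of scheme under stated assumptions: draw several candidate chunks from the untouched LLM, score each with a keyed hash-based statistic, and keep the chunk with the smallest p-value. All names (`sample_chunk`, `chunk_pvalue`, `n_drafts`) and the toy scoring function are illustrative assumptions, not the paper's actual detector.

```python
import hashlib
import math
import random

SECRET_KEY = b"illustrative-secret-key"  # detector's secret (assumption)

def token_score(prev_token: str, token: str) -> float:
    """Map a (context, token) pair to a keyed pseudo-random u in (0, 1].
    At detection time the same hash is recomputed with the secret key."""
    h = hashlib.sha256(SECRET_KEY + prev_token.encode() + token.encode())
    return (int.from_bytes(h.digest()[:8], "big") + 1) / 2**64

def chunk_pvalue(tokens: list[str]) -> float:
    """Aggregate per-token scores into a p-value-like statistic via the
    sum of -log(u); smaller means stronger watermark evidence. Under H0
    (unwatermarked text) the sum is Gamma(k, 1); a normal tail bound is
    used here as a crude stand-in for the exact survival function."""
    k = len(tokens) - 1
    s = sum(-math.log(token_score(a, b)) for a, b in zip(tokens, tokens[1:]))
    z = (s - k) / math.sqrt(k)
    return 0.5 * math.erfc(z / math.sqrt(2))

def watermax_generate(sample_chunk, prompt: list[str],
                      n_chunks: int = 4, n_drafts: int = 8) -> list[str]:
    """For each chunk position, draw n_drafts completions from the
    *unmodified* LLM and keep the draft whose p-value is smallest.
    Text quality is that of the original model; detectability grows
    with n_drafts at the cost of extra generation calls."""
    text = list(prompt)
    for _ in range(n_chunks):
        drafts = [sample_chunk(text) for _ in range(n_drafts)]
        # Prepend the last context token so the draft's first token is scored.
        text += min(drafts, key=lambda d: chunk_pvalue(text[-1:] + d))
    return text

if __name__ == "__main__":
    # Toy stand-in for an LLM: returns a random 16-"token" chunk.
    vocab = ["the", "a", "water", "mark", "model", "text", "sample", "of"]
    toy_llm = lambda ctx: random.choices(vocab, k=16)
    out = watermax_generate(toy_llm, ["Once"])
    print("p-value of generated text:", chunk_pvalue(out))
```

In this sketch the knob `n_drafts` makes the robustness-versus-complexity trade concrete: more drafts per chunk means more LLM calls but a smaller expected p-value, hence a watermark that survives heavier editing before detection fails.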