Advancing Beyond Identification: Multi-bit Watermark for Large Language Models (2308.00221v3)

Published 1 Aug 2023 in cs.CL, cs.AI, and cs.CR

Abstract: We show the viability of tackling misuses of LLMs beyond the identification of machine-generated text. While existing zero-bit watermark methods focus on detection only, some malicious misuses demand tracing the adversarial user in order to counteract them. To address this, we propose Multi-bit Watermark via Position Allocation, which embeds traceable multi-bit information during LLM generation. By allocating tokens to different positions of the message, we can embed longer messages in high-corruption settings without added latency. Because sub-units of the message are embedded independently, the proposed method outperforms existing work in both robustness and latency. Leveraging the benefits of zero-bit watermarking, our method enables robust extraction of the watermark without any model access, supports embedding and extraction of long messages ($\geq$ 32 bits) without finetuning, and maintains text quality, all while still allowing zero-bit detection. Code is released here: https://github.com/bangawayoo/mb-lm-watermarking
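The abstract compresses the mechanism into a sentence, so a sketch may help. Position allocation extends hash-based zero-bit watermarking: at each generation step, a PRNG seeded from the preceding context assigns the step to one position of the multi-bit message, and the vocabulary partition ("colorlist") that receives the logit bias is chosen by the message bit at that position. Extraction replays the same PRNG over the observed tokens and takes a majority vote per position. The following Python is a minimal illustrative sketch under assumed details (a binary message, seeding on the single previous token, a toy vocabulary, and hypothetical names such as GAMMA, DELTA, and bias_logits); it is not the authors' implementation, which lives in the linked repository.

```python
import hashlib
import random

# Illustrative sketch of multi-bit watermarking via position allocation.
# Assumptions (not taken from the paper's released code): a binary message,
# seeding on the single previous token, and a toy vocabulary size.

VOCAB_SIZE = 1_000   # toy vocabulary size
GAMMA = 0.5          # fraction of the vocabulary in the favored colorlist
DELTA = 2.0          # logit bias added to the favored colorlist

def _rng(prev_token: int) -> random.Random:
    """Seed a PRNG from the previous token, as in hash-based zero-bit schemes."""
    digest = hashlib.sha256(str(prev_token).encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))

def _split(rng: random.Random):
    """Pseudorandomly partition the vocabulary into two colorlists (bit 0 / bit 1)."""
    vocab = list(range(VOCAB_SIZE))
    rng.shuffle(vocab)
    cut = int(GAMMA * VOCAB_SIZE)
    return set(vocab[:cut]), set(vocab[cut:])

def bias_logits(prev_token: int, logits: list[float], message: list[int]) -> list[float]:
    """Allocate this step to one message position, then favor the matching colorlist."""
    rng = _rng(prev_token)
    pos = rng.randrange(len(message))          # position allocation
    colorlists = _split(rng)
    favored = colorlists[message[pos]]         # colorlist selected by the message bit
    return [l + DELTA if tok in favored else l for tok, l in enumerate(logits)]

def extract(tokens: list[int], msg_len: int) -> list[int]:
    """Recover each bit by majority vote over the tokens allocated to its position."""
    votes = [[0, 0] for _ in range(msg_len)]
    for prev, cur in zip(tokens, tokens[1:]):
        rng = _rng(prev)
        pos = rng.randrange(msg_len)           # replay the same position allocation
        list0, _ = _split(rng)
        bit = 0 if cur in list0 else 1
        votes[pos][bit] += 1
    return [0 if v[0] >= v[1] else 1 for v in votes]
```

Because extraction only counts colorlist membership per position, it needs the tokenized text and the shared hashing scheme rather than the model itself, which is what makes message recovery possible without model access; positions that happen to receive few tokens are simply the ones most vulnerable to corruption.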
