Less is More: Sparse Watermarking in LLMs with Enhanced Text Quality (2407.13803v1)

Published 17 Jul 2024 in cs.CR, cs.AI, and cs.CL

Abstract: With the widespread adoption of LLMs, concerns about potential misuse have emerged. To this end, watermarking has been adapted to LLMs, enabling a simple and effective way to detect and monitor generated text. However, while existing methods can differentiate between watermarked and unwatermarked text with high accuracy, they often face a trade-off between the quality of the generated text and the effectiveness of the watermarking process. In this work, we present a novel type of LLM watermark, Sparse Watermark, which aims to mitigate this trade-off by applying watermarks to a small subset of generated tokens distributed across the text. The key strategy involves anchoring watermarked tokens to words that have specific Part-of-Speech (POS) tags. Our experimental results demonstrate that the proposed watermarking scheme achieves high detectability while generating text that outperforms previous LLM watermarking methods in quality across various tasks.
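The idea of anchoring watermarked tokens to POS-tagged words can be illustrated with a minimal sketch. This is not the paper's exact algorithm: the toy vocabulary, the choice of `NOUN` as the anchor tag, and the hash-based green-list construction are all illustrative assumptions. The sketch only shows the core mechanism: most tokens are generated freely, and only the token immediately following an anchor-tagged word is constrained to a pseudorandom "green" half of the vocabulary seeded by that word, which a detector can later check.

```python
import hashlib

# Toy vocabulary and anchor POS tags (illustrative assumptions,
# not the paper's actual setup).
VOCAB = ["alpha", "bravo", "charlie", "delta",
         "echo", "foxtrot", "golf", "hotel"]
ANCHOR_TAGS = {"NOUN"}  # POS tags that trigger watermarking


def green_list(anchor_word, vocab=VOCAB):
    """Derive a pseudorandom 'green' half of the vocabulary from the
    anchor word, so the detector can reconstruct it without the model."""
    ranked = sorted(
        vocab,
        key=lambda tok: hashlib.sha256((anchor_word + tok).encode()).hexdigest(),
    )
    return set(ranked[: len(vocab) // 2])


def is_watermarked(words, tags):
    """Detection sketch: every token that follows an anchor-tagged word
    must lie in that word's green list. Unwatermarked text will violate
    this with high probability as the number of anchors grows."""
    hits = total = 0
    for (word, tag), nxt in zip(zip(words, tags), words[1:]):
        if tag in ANCHOR_TAGS:
            total += 1
            hits += nxt in green_list(word)
    return total > 0 and hits == total
```

Because only tokens adjacent to anchor words are constrained, the rest of the text is sampled from the unmodified distribution, which is the intuition behind the quality/detectability trade-off the abstract describes. A real implementation would run a POS tagger over the context and bias the model's logits toward the green list rather than filtering a fixed vocabulary.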

Authors (7)
  1. Duy C. Hoang (3 papers)
  2. Hung T. Q. Le (1 paper)
  3. Rui Chu (2 papers)
  4. Ping Li (421 papers)
  5. Weijie Zhao (44 papers)
  6. Yingjie Lao (22 papers)
  7. Khoa D. Doan (36 papers)