Watermarking Discrete Diffusion Language Models (2511.02083v1)
Abstract: Watermarking has emerged as a promising technique to track AI-generated content and differentiate it from authentic human creations. While prior work extensively studies watermarking for autoregressive LLMs and image diffusion models, none address discrete diffusion LLMs, which are becoming popular due to their high inference throughput. In this paper, we introduce the first watermarking method for discrete diffusion models by applying the distribution-preserving Gumbel-max trick at every diffusion step and seeding the randomness with the sequence index to enable reliable detection. We experimentally demonstrate that our scheme is reliably detectable on state-of-the-art diffusion LLMs and analytically prove that it is distortion-free with an exponentially decaying probability of false detection in the token sequence length.
Sponsor
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.