
Left-Right Diffusion Watermarking

Updated 25 January 2026
  • Left-Right Diffusion Watermarking (LR-DWM) is a framework for diffusion language models that embeds watermark signals using both left and right neighbor tokens.
  • It employs cryptographic hash functions and independent logit biasing to inject order-agnostic watermarks with minimal computational and memory overhead.
  • Experimental evaluations demonstrate high detection rates with low perplexity impact and robust performance against typical deletion and substitution attacks.

Left-Right Diffusion Watermarking (LR-DWM) is a watermarking framework designed specifically for Diffusion LLMs (DLMs), enabling robust statistical detection and attribution of AI-generated text with minimal computational and memory overhead. Unlike previous watermarking systems adapted to autoregressive (AR) LLMs, LR-DWM is tailored for the non-sequential, iterative denoising process of DLMs and leverages locally-available token context, notably both left and right neighbors, to inject order-agnostic watermark signals (Raban et al., 18 Jan 2026).

1. Motivation and Technical Foundations

Autoregressive watermarking mechanisms rely on the left-to-right, sequential generation characteristic of AR LLMs: for each token position $t$, a hash of the entire preceding context $y_{<t}$ determines a “green set” $G_t$ (a balanced subset of the vocabulary), whose members are promoted via logit biasing during generation. However, DLMs generate text by iteratively refining tokens in a parallel or arbitrary position order, so many neighbors remain undecided at any update step, rendering AR-style hashing inapplicable without extensive architectural changes or process inversion (Raban et al., 18 Jan 2026, Gloaguen et al., 29 Sep 2025).

The key insight of LR-DWM is to use each finalized neighbor, as soon as it becomes available during denoising, as a local anchor for watermark constraints. Each token position $i$ applies independent logit biases according to the hash-derived green sets formed by the revealed left ($L = y_{i-1}$) and right ($R = y_{i+1}$) neighbors. These additive constraints allow effective watermarking regardless of generation order and without dependence on central memory or expectation steps.

2. Mathematical Formulation

Let $y = (y_1, \dots, y_T)$ denote the token sequence and $\mathcal{V}$ the vocabulary. DLMs proceed over $S$ denoising steps; at step $s$, a subset $I_s \subseteq \{1, \dots, T\}$ of positions is selected for update. For each $i \in I_s$, the unwatermarked logits $l = l(y^{(s)}, i) \in \mathbb{R}^{|\mathcal{V}|}$ define the conditional distribution:

$$p_\theta(y_i = x \mid y^{(s)}, i) \propto \exp(l_x)$$

If $y_{i-1}$ or $y_{i+1}$ is finalized, cryptographic hash functions parameterized by secret keys $k_L$ and $k_R$ generate green sets:

$$G_L^{(i)} = \mathcal{G}(y_{i-1}; k_L) \qquad G_R^{(i)} = \mathcal{G}(y_{i+1}; k_R)$$
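
For concreteness, the keyed map $\mathcal{G}$ can be sketched as a pseudorandom balanced subset seeded by a hash of the neighbor token and a secret key. SHA-256 seeding and a green fraction of $0.5$ are illustrative assumptions here, not the paper's exact construction:

```python
import hashlib
import random

def green_set(neighbor_token: int, key: bytes, vocab_size: int,
              gamma: float = 0.5) -> set:
    """Illustrative sketch of G(neighbor; key): a pseudorandom, balanced
    'green' subset of the vocabulary derived from one finalized neighbor
    token and a secret key. (SHA-256 seeding and gamma = 0.5 are
    assumptions, not the paper's specified hash construction.)"""
    seed = hashlib.sha256(key + neighbor_token.to_bytes(4, "big")).digest()
    rng = random.Random(seed)
    return set(rng.sample(range(vocab_size), int(gamma * vocab_size)))
```

Because the set depends only on the neighbor token and the key, a detector can recompute it after generation without any access to the model.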

Watermark insertion occurs by adding bias parameters $\lambda_L$ and $\lambda_R$ (typically both set to a strength hyperparameter $\delta$) to logits indexed by green-set membership:

$$l'_v = l_v + \lambda_L \, \mathbb{I}[v \in G_L^{(i)}] + \lambda_R \, \mathbb{I}[v \in G_R^{(i)}]$$

The watermarked sampling distribution thus becomes:

$$\tilde{p}_\theta(y_i = v \mid y^{(s)}, i) \propto p_\theta(y_i = v \mid y^{(s)}, i) \cdot \exp\!\left(\lambda_L f_L(v, L) + \lambda_R f_R(v, R)\right)$$

where $f_L(v, L)$ and $f_R(v, R)$ are indicator functions for green-set membership.
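
The biasing rule can be sketched in a few lines. This hypothetical helper assumes green sets are passed as Python sets (or `None` when the corresponding neighbor is not yet finalized):

```python
def bias_logits(logits, green_L=None, green_R=None, delta=2.0):
    """Sketch of l'_v = l_v + λ_L·I[v ∈ G_L] + λ_R·I[v ∈ G_R] with
    λ_L = λ_R = δ. A green set passed as None means that neighbor is
    not yet finalized, so its term contributes nothing."""
    biased = list(logits)
    for green in (green_L, green_R):
        if green is not None:
            for v in green:          # add +delta for each green-set hit
                biased[v] += delta
    return biased
```

A token in both green sets receives a $2\delta$ boost, which is exactly how the two independent constraints compose additively.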

Statistical detectability is quantified post-generation. Each token $i$ yields match indicators $m_L(i), m_R(i) \in \{0, 1\}$ and a ternary score:

$$s_i = m_L(i) + m_R(i) - 1 \in \{-1, 0, +1\}$$

Aggregating over the sequence and normalizing by an empirically calibrated variance $\sigma^2$ from human texts gives the detection statistic:

$$Z = \frac{\sum_{i=1}^T s_i}{\sigma \sqrt{T}}$$

Under the null hypothesis $H_0$ (unwatermarked/human text), $Z$ is approximately standard normal, permitting precise control of false positive rates.
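
The statistic itself is a one-liner over the per-token scores (a minimal sketch; $\sigma$ is assumed to be pre-calibrated on human text):

```python
import math

def z_score(scores, sigma):
    """Z = (Σ s_i) / (σ · √T) over per-token ternary scores s_i ∈ {-1, 0, +1}."""
    T = len(scores)
    return sum(scores) / (sigma * math.sqrt(T))
```

Comparing the result against a standard-normal quantile (e.g., 2.33 for a 1% false-positive rate) then yields the detection decision.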

3. Algorithmic Workflow and Implementation

LR-DWM is realized with minimal overhead and does not require caching or process inversion. The pseudocode below summarizes the insertion process in DLM decoding (Raban et al., 18 Jan 2026).

Input: prompt c, diffusion model M, secret keys k_L, k_R, bias δ
Output: generated sequence y

Initialize y^{(S)} by corrupting the prompt-conditioned token sequence
for s = S downto 1:
    I_s ← positions to update
    for i in I_s:
        l ← M.logits(y^{(s)}, i)
        if y^{(s)}_{i-1} revealed:
            G_L ← G(y_{i-1}, k_L)
            l[v in G_L] += δ
        if y^{(s)}_{i+1} revealed:
            G_R ← G(y_{i+1}, k_R)
            l[v in G_R] += δ
        y_i ← sample from Softmax(l) or deterministic decode
    y^{(s-1)} ← updated sequence
return y^{(0)}

Computation of green sets via fast hashing is $O(1)$ per neighbor; logit biasing is $O(|\mathcal{V}|)$ per update. At boundaries (positions $i = 1$ or $i = T$), only one-sided biasing applies. Total runtime and peak memory overhead remain within 1–2% of an unwatermarked DLM.
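
Under toy assumptions (uniform base logits over a small vocabulary, SHA-256-seeded green sets, a random denoising order with one position finalized per step), the insert-then-detect loop above can be simulated end to end. All names and constants below are illustrative, not the paper's implementation:

```python
import hashlib
import math
import random

VOCAB, GAMMA, DELTA = 64, 0.5, 4.0  # toy vocab size, green fraction, bias strength

def green_set(tok: int, key: bytes) -> set:
    """Keyed pseudorandom balanced subset of the vocabulary (illustrative hash)."""
    seed = hashlib.sha256(key + tok.to_bytes(4, "big")).digest()
    return set(random.Random(seed).sample(range(VOCAB), int(GAMMA * VOCAB)))

def sample_token(rng: random.Random, greens: list) -> int:
    """Sample from the softmax of uniform logits plus +DELTA per green-set hit."""
    weights = [math.exp(DELTA * sum(v in g for g in greens)) for v in range(VOCAB)]
    return rng.choices(range(VOCAB), weights=weights)[0]

def generate(T: int, k_L: bytes, k_R: bytes, seed: int = 0) -> list:
    """Fill positions in an arbitrary order, biasing each draw by whichever
    neighbors are already finalized (the core LR-DWM idea)."""
    rng = random.Random(seed)
    y = [None] * T
    order = list(range(T))
    rng.shuffle(order)  # arbitrary, non-left-to-right denoising order
    for i in order:
        greens = []
        if i > 0 and y[i - 1] is not None:
            greens.append(green_set(y[i - 1], k_L))
        if i < T - 1 and y[i + 1] is not None:
            greens.append(green_set(y[i + 1], k_R))
        y[i] = sample_token(rng, greens)
    return y

def detection_z(y: list, k_L: bytes, k_R: bytes) -> float:
    """Z-statistic from ternary scores s_i = m_L(i) + m_R(i) - 1."""
    scores = []
    for i, v in enumerate(y):
        m_L = i > 0 and v in green_set(y[i - 1], k_L)
        m_R = i < len(y) - 1 and v in green_set(y[i + 1], k_R)
        scores.append(int(m_L) + int(m_R) - 1)
    sigma = math.sqrt(0.5)  # Var(s_i) under H0: two independent Bernoulli(1/2) matches
    return sum(scores) / (sigma * math.sqrt(len(scores)))
```

With the correct keys the Z-statistic comes out large even though no position ever saw its full left context; recomputing it with unrelated keys gives a value near zero, as for human text.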

4. Detection, Guarantees, and Analytical Properties

Under $H_0$, neighbor-derived green sets for each token position are indistinguishable from random assignment; match indicators thus follow independent Bernoulli$(\tfrac{1}{2})$ distributions, and $Z$ is calibrated for a controlled FPR (e.g., $Z > 2.33$ yields $1\%$ FPR). Under the watermarking hypothesis $H_1$, each available neighbor increases the match probability by an amount approximately linear in $\delta$. For small $\delta$, analytical expressions predict the Z-statistic mean shift and statistical power:

$$E_{H_1}[s_i] \approx \frac{\lambda_L + \lambda_R}{2(1 + e^{-\delta})}$$

Complexity analysis shows that both insertion and detection are computationally efficient. LR-DWM requires $O(|\mathcal{V}|)$ work per updated position, with no need for the $|\mathcal{V}|^2$ lookup tables or expectation steps required by prior DLM watermarking variants (Raban et al., 18 Jan 2026). Practical watermark detectors apply Z-thresholding on sequences of a few hundred tokens.
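
The correspondence between the Z-threshold and the false-positive rate follows directly from the standard-normal null; a sketch using the standard library's `statistics.NormalDist` (the 2.33 threshold is the one quoted above):

```python
from statistics import NormalDist

def fpr_at_threshold(z_threshold: float) -> float:
    """Under H0, Z ~ N(0, 1); the FPR is the upper-tail probability."""
    return 1.0 - NormalDist().cdf(z_threshold)

def threshold_for_fpr(fpr: float) -> float:
    """Inverse mapping: the Z-threshold achieving a target FPR."""
    return NormalDist().inv_cdf(1.0 - fpr)
```

For example, `fpr_at_threshold(2.33)` evaluates to roughly 0.0099, matching the 1% FPR cited for that threshold.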

5. Experimental Evaluation

Experiments conducted on LLaDA-8B-Instruct and DREAM-7B-Instruct using the WaterBench dataset (600 prompts, sequences of length $T = 300$, $S = 300$ steps) demonstrate that LR-DWM achieves high watermark detection rates with negligible quality impact (Raban et al., 18 Jan 2026). Performance metrics include:

  • Wall-clock and memory overhead: LR-DWM is nearly identical to the baseline; WM-DLM incurs $\approx 20\%$ overhead, and DMARK doubles GPU memory.
  • Perplexity (PPL) and detectability trade-off:

| Detection Rate | WM-DLM PPL ± SEM | DMARK PPL ± SEM | LR-DWM PPL ± SEM |
|---|---|---|---|
| 90% | 5.07 ± 1.40 | 2.82 ± 0.51 | 2.80 ± 0.46 |
| 99% | 6.14 ± 1.84 | 3.28 ± 0.61 | 3.32 ± 0.65 |
| 99.5% | 6.33 ± 1.90 | 3.34 ± 0.63 | 3.37 ± 0.66 |

LR-DWM matches DMARK's detection-vs-quality profile and surpasses WM-DLM for low PPL at high detection rates.

  • Robustness: LR-DWM achieves 98.8% detection under 10% random deletion, 99.4% under 10% BERT-based substitution, but only 15.8% under paraphrasing, a limitation common to local-context watermarks.

Additional trials with sequence lengths $\geq 50$ show detection rates $> 99\%$ at $1\%$ FPR, with log-perplexity increases under $0.05$ and expert judgment showing unchanged style, accuracy, and ethics (Gloaguen et al., 29 Sep 2025).

6. Limitations, Context, and Future Directions

LR-DWM exhibits a fundamental trade-off between watermark strength ($\delta$) and fluency as measured by perplexity. Very large $\delta$ values may be unsuitable for sensitive applications that demand high output quality. Reliable statistical detection is achieved on text strings of a few hundred tokens or more.

Robustness to paraphrase remains an open challenge, as local context constraints are disrupted by extensive rephrasing. The method is resilient to typical deletion and substitution attacks, but not global rewrites. Ongoing research seeks to integrate wider context (e.g., multi-token anchors) and adaptive strength parameters based on position-level confidence. Extensions to more syntax-aware or global watermark designs are proposed.

LR-DWM constitutes an efficient, two-sided watermarking method for DLMs by exploiting whichever neighbor tokens are available during generation. The scheme achieves competitive detection-vs-quality trade-offs and preserves both computational and memory efficiency relative to non-watermarked baselines, marking a substantial advance in the domain of diffusion-model watermarking (Raban et al., 18 Jan 2026, Gloaguen et al., 29 Sep 2025).
