Left-Right Diffusion Watermarking
- Left-Right Diffusion Watermarking (LR-DWM) is a framework for diffusion language models that embeds watermark signals using both left and right neighbor tokens.
- It employs cryptographic hash functions and independent logit biasing to inject order-agnostic watermarks with minimal computational and memory overhead.
- Experimental evaluations demonstrate high detection rates with low perplexity impact and robust performance against typical deletion and substitution attacks.
Left-Right Diffusion Watermarking (LR-DWM) is a watermarking framework designed specifically for Diffusion LLMs (DLMs), enabling robust statistical detection and attribution of AI-generated text with minimal computational and memory overhead. Unlike previous watermarking systems adapted to autoregressive (AR) LLMs, LR-DWM is tailored for the non-sequential, iterative denoising process of DLMs and leverages locally available token context, notably both left and right neighbors, to inject order-agnostic watermark signals (Raban et al., 18 Jan 2026).
1. Motivation and Technical Foundations
Autoregressive watermarking mechanisms rely on the left-to-right, sequential generation characteristic of AR LLMs: for each token position $i$, a hash of the preceding context determines a “green set” (a balanced subset of the vocabulary), whose members are promoted via logit biasing during generation. However, DLMs generate text by iteratively refining tokens in parallel or in arbitrary position order, so many neighbors remain undecided at any update step, rendering AR-style hashing inapplicable without extensive architectural changes or process inversion (Raban et al., 18 Jan 2026; Gloaguen et al., 29 Sep 2025).
The key insight of LR-DWM is to use each finalized neighbor, as soon as it becomes available during denoising, as a local anchor for watermark constraints. Each token position $i$ applies independent logit biases according to the hash-derived green sets formed by the revealed left ($y_{i-1}$) and right ($y_{i+1}$) neighbors. These additive constraints allow effective watermarking regardless of generation order and without dependence on central memory or expectation steps.
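As an illustration, a keyed cryptographic hash can deterministically derive a green set from a single neighbor token. The sketch below uses BLAKE2 keyed hashing to seed a PRNG that selects a fixed fraction of the vocabulary; this is a hypothetical construction, since the exact hash and selection procedure are not fixed here.

```python
import hashlib
import random

def green_set(neighbor_token: int, key: bytes, vocab_size: int,
              gamma: float = 0.25) -> set:
    """Derive a pseudorandom green set of size gamma*|V| from one revealed
    neighbor token and a secret key (illustrative hash/PRNG choice)."""
    digest = hashlib.blake2b(neighbor_token.to_bytes(4, "big"), key=key).digest()
    rng = random.Random(digest)          # keyed hash seeds the selection
    k = int(gamma * vocab_size)
    return set(rng.sample(range(vocab_size), k))

# Same neighbor, different keys: independent-looking left and right sets.
left = green_set(42, key=b"k_L", vocab_size=1000)
right = green_set(42, key=b"k_R", vocab_size=1000)
```

Because the set depends only on the neighbor token and the key, generator and detector reconstruct identical green sets without any shared state beyond the keys.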
2. Mathematical Formulation
Let $y = (y_1, \dots, y_n)$ denote the token sequence and $V$ the vocabulary. DLMs proceed over $S$ denoising steps; at step $s$, a subset of positions $I_s \subseteq \{1, \dots, n\}$ is selected for update. For each $i \in I_s$, the unwatermarked logits $\ell_i \in \mathbb{R}^{|V|}$ define the conditional distribution:

$$p(y_i = v) = \mathrm{softmax}(\ell_i)_v = \frac{\exp(\ell_{i,v})}{\sum_{u \in V} \exp(\ell_{i,u})}.$$

If $y_{i-1}$ or $y_{i+1}$ is finalized, cryptographic hash functions parameterized by secret keys $k_L$ and $k_R$ generate green sets:

$$G_L = G(y_{i-1}, k_L) \subset V, \qquad G_R = G(y_{i+1}, k_R) \subset V,$$

each containing a fixed fraction $\gamma$ of the vocabulary. Watermark insertion occurs by adding bias parameters $\delta_L$ and $\delta_R$ (typically both set to a strength hyperparameter $\delta$) to logits indexed by green-set membership:

$$\tilde\ell_{i,v} = \ell_{i,v} + \delta_L\,\mathbb{1}[v \in G_L] + \delta_R\,\mathbb{1}[v \in G_R].$$

The watermarked sampling distribution thus becomes:

$$\tilde p(y_i = v) = \frac{\exp(\tilde\ell_{i,v})}{\sum_{u \in V} \exp(\tilde\ell_{i,u})},$$

where $\mathbb{1}[v \in G_L]$ and $\mathbb{1}[v \in G_R]$ are indicator functions for set membership.

Statistical detectability is quantified post-generation. Each token $y_i$ yields match indicators $m_L(i) = \mathbb{1}[y_i \in G(y_{i-1}, k_L)]$ and $m_R(i) = \mathbb{1}[y_i \in G(y_{i+1}, k_R)]$, combined into a ternary score:

$$s_i = m_L(i) + m_R(i) \in \{0, 1, 2\}.$$

Aggregating over the sequence and normalizing by an empirically calibrated variance $\hat\sigma^2$ from human texts gives the detection statistic:

$$Z = \frac{\sum_{i=1}^{n} (s_i - 2\gamma)}{\hat\sigma\sqrt{n}}.$$

Under the null hypothesis $H_0$ (unwatermarked/human text), $Z$ is approximately standard normal, permitting precise control of false positive rates.
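Given the per-token scores, detection reduces to a single pass over the text. The sketch below takes the green-set fraction and the human-text-calibrated standard deviation as given; the numeric values are illustrative assumptions, not from the paper.

```python
import math

def z_statistic(scores, gamma: float, sigma: float) -> float:
    """Z = sum_i (s_i - 2*gamma) / (sigma * sqrt(n)), where each score
    s_i in {0, 1, 2} counts neighbor green-set matches for token i."""
    n = len(scores)
    return sum(s - 2 * gamma for s in scores) / (sigma * math.sqrt(n))

# Watermarked-looking text: most tokens match both neighbor green sets.
z_watermarked = z_statistic([2] * 180 + [1] * 20, gamma=0.25, sigma=0.7)
# Human-looking text: matches occur at the chance rate 2*gamma per token.
z_human = z_statistic([1, 0] * 100, gamma=0.25, sigma=0.7)
```

Thresholding the statistic (e.g., flagging only large positive values) then bounds the false positive rate by the standard normal tail.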
3. Algorithmic Workflow and Implementation
LR-DWM is realized with minimal overhead and does not require caching or process inversion. The pseudocode below summarizes the insertion process in DLM decoding (Raban et al., 18 Jan 2026).
```
Input:  prompt c, diffusion model M, secret keys k_L, k_R, bias δ
Output: generated sequence y

Initialize y^{(S)} by corrupting the prompt-conditioned token sequence
for s = S downto 1:
    I_s ← positions selected for update at step s
    for i in I_s:
        ℓ ← M.logits(y^{(s)}, i)
        if y^{(s)}_{i-1} is revealed:
            G_L ← G(y^{(s)}_{i-1}, k_L)
            ℓ[v] += δ for all v in G_L
        if y^{(s)}_{i+1} is revealed:
            G_R ← G(y^{(s)}_{i+1}, k_R)
            ℓ[v] += δ for all v in G_R
        y_i ∼ Softmax(ℓ), or deterministic decode
    y^{(s-1)} ← updated sequence
return y^{(0)}
```
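The loop above can be exercised end-to-end in a toy setting. In the sketch below, uniform zero logits stand in for a real diffusion model's predictions and a keyed BLAKE2 hash (a hypothetical choice) implements the green-set function G, so only the watermark mechanics, not the language modeling, are meaningful.

```python
import hashlib
import math
import random

VOCAB, GAMMA, DELTA = 50, 0.25, 4.0

def green_set(tok: int, key: bytes) -> set:
    """Keyed-hash green set for one revealed neighbor (illustrative)."""
    digest = hashlib.blake2b(tok.to_bytes(4, "big"), key=key).digest()
    return set(random.Random(digest).sample(range(VOCAB), int(GAMMA * VOCAB)))

def denoise(n: int, steps: int, seed: int = 0) -> list:
    """Toy LR-DWM insertion loop over an arbitrary update order."""
    rng = random.Random(seed)
    y = [None] * n                        # None marks a still-masked position
    order = list(range(n))
    rng.shuffle(order)                    # non-sequential position order
    per_step = math.ceil(n / steps)
    for s in range(steps):
        for i in order[s * per_step:(s + 1) * per_step]:
            logits = [0.0] * VOCAB        # stand-in for M.logits(y, i)
            if i > 0 and y[i - 1] is not None:       # left anchor revealed
                for v in green_set(y[i - 1], b"k_L"):
                    logits[v] += DELTA
            if i < n - 1 and y[i + 1] is not None:   # right anchor revealed
                for v in green_set(y[i + 1], b"k_R"):
                    logits[v] += DELTA
            weights = [math.exp(l) for l in logits]  # softmax sampling
            y[i] = rng.choices(range(VOCAB), weights=weights)[0]
    return y
```

Scoring the output against the same keyed green sets then recovers far more neighbor matches than the chance rate $2\gamma$ per token, regardless of the order in which positions were revealed.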
4. Detection, Guarantees, and Analytical Properties
Under $H_0$, neighbor-anchored green sets for each token position are indistinguishable from random assignment; match indicators thus follow independent Bernoulli($\gamma$) distributions, and thresholding the calibrated $Z$ controls the FPR through the standard normal tail. Under the watermarking hypothesis $H_1$, each available neighbor increases the match probability by an amount linear in $\delta$ for small $\delta$; to first order, the predicted Z-statistic mean shift is

$$\mathbb{E}[Z \mid H_1] \approx \frac{\bar{a}\,\gamma(1-\gamma)\,\delta\,\sqrt{n}}{\hat\sigma},$$

where $\bar{a} \in [0, 2]$ is the average number of revealed neighbors per position, from which statistical power follows.
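These predictions can be checked numerically. Under a simplifying assumption of uniform base logits (an illustration, not the paper's derivation), boosting a $\gamma$-fraction green set by $\delta$ lifts the match probability from $\gamma$ to $\gamma e^{\delta} / (\gamma e^{\delta} + 1 - \gamma)$:

```python
import math

def match_prob(gamma: float, delta: float) -> float:
    """P(sampled token lands in the green set), uniform base logits."""
    return gamma * math.exp(delta) / (gamma * math.exp(delta) + 1 - gamma)

def z_mean_shift(n: int, gamma: float, delta: float,
                 avail: float = 2.0, sigma: float = 0.7) -> float:
    """Predicted E[Z | H1]; 'avail' = mean revealed neighbors per token."""
    return avail * (match_prob(gamma, delta) - gamma) * math.sqrt(n) / sigma

# For small delta the lift linearizes to gamma * (1 - gamma) * delta.
small_lift = match_prob(0.25, 0.01) - 0.25
```

Even moderate settings (e.g., a few hundred tokens) yield a mean shift of many standard deviations under this model, consistent with the high detection rates reported below.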
Complexity analysis shows both insertion and detection are computationally efficient: LR-DWM adds only keyed-hash green-set computation and additive biasing per updated position, with no lookup tables or expectation steps as required by prior DLM watermarking variants (Raban et al., 18 Jan 2026). Practical watermark detectors apply Z-thresholding to sequences of a few hundred tokens.
5. Experimental Evaluation
Experiments conducted on LLaDA-8B-Instruct and DREAM-7B-Instruct using the WaterBench dataset (600 prompts, with fixed sequence length and number of denoising steps) demonstrate that LR-DWM achieves high watermark detection rates with negligible quality impact (Raban et al., 18 Jan 2026). Performance metrics include:
- Wall-clock and memory overhead: LR-DWM is nearly identical to the unwatermarked baseline; WM-DLM incurs additional runtime overhead, and DMARK doubles GPU memory.
- Perplexity and detectability trade-off (PPL ± SEM at matched detection rates; lower PPL is better):

| Detection Rate | WM-DLM (PPL ± SEM) | DMARK (PPL ± SEM) | LR-DWM (PPL ± SEM) |
|---|---|---|---|
| 90% | 5.07 ± 1.40 | 2.82 ± 0.51 | 2.80 ± 0.46 |
| 99% | 6.14 ± 1.84 | 3.28 ± 0.61 | 3.32 ± 0.65 |
| 99.5% | 6.33 ± 1.90 | 3.34 ± 0.63 | 3.37 ± 0.66 |
LR-DWM matches DMARK's detection-vs-quality profile and achieves markedly lower perplexity than WM-DLM at high detection rates.
- Robustness: LR-DWM achieves 98.8% detection under 10% random deletion, 99.4% under 10% BERT-based substitution, but only 15.8% under paraphrasing, a limitation common to local-context watermarks.
Additional trials across a range of sequence lengths show high detection rates at fixed FPR, with log-perplexity increases under $0.05$ and expert judgments indicating unchanged style, accuracy, and ethics (Gloaguen et al., 29 Sep 2025).
6. Limitations, Context, and Future Directions
LR-DWM exhibits a fundamental trade-off between watermark strength ($\delta$) and fluency as measured by perplexity. Very large $\delta$ values may be unsuitable for sensitive applications that demand high output quality. Reliable statistical detection requires text strings of a few hundred tokens or more.
Robustness to paraphrase remains an open challenge, as local context constraints are disrupted by extensive rephrasing. The method is resilient to typical deletion and substitution attacks, but not global rewrites. Ongoing research seeks to integrate wider context (e.g., multi-token anchors) and adaptive strength parameters based on position-level confidence. Extensions to more syntax-aware or global watermark designs are proposed.
LR-DWM constitutes an efficient, two-sided watermarking method for DLMs by exploiting whichever neighbor tokens are available during generation. The scheme achieves competitive detection-vs-quality trade-offs and preserves both computational and memory efficiency relative to non-watermarked baselines, marking a substantial advance in the domain of diffusion-model watermarking (Raban et al., 18 Jan 2026, Gloaguen et al., 29 Sep 2025).