Practical Watermark Removal Methods
- Practical watermark removal is formulated as an inverse problem: restore the underlying media accurately while causing watermark detectors to fail.
- State-of-the-art systems use deep learning architectures such as U-Nets, Transformers, and GAN-enhanced pipelines, reaching, for example, 44.64 dB PSNR and an F1 mask score of 0.8769 (WMFormer++).
- The methods apply to both visible and invisible watermarks in images and audio, addressing challenges in adversarial robustness and content fidelity.
A practical watermark removal method encompasses a range of algorithmic strategies and architectural designs—most notably for images and audio—that precisely target and eliminate visible or invisible digital watermarks with minimal loss of content fidelity. Such methods range from deterministic preprocessing and classic image restoration to highly adaptive, learning-based pipelines that directly attack the watermark's underlying statistical or generative mechanism. Contemporary research (2020–2025) demonstrates that effective watermark removal is viable against both visible and invisible marks, as well as against modern watermarking schemes for large vision and language models and diffusion-based generators.
1. Problem Formulation and Threat Models
Practical watermark removal solves an inverse problem: for an observed media object (image, audio) $x_w$, believed to be formed by a composite such as $x_w = \alpha \odot w + (1-\alpha) \odot x$ (a visible mark $w$ blended with per-pixel opacity $\alpha$) or an analogous latent-space/additive modulation, recover an output $\hat{x}$ that both (a) minimizes the perceptual difference to the original $x$, and (b) ensures that a watermark detector fails under operational detection thresholds.
Threat models include:
- White-box (full knowledge of watermark embedding/decoding);
- Beige-box (algorithm known, not the key or fine-tuning specifics) (Shamshad et al., 28 Aug 2025);
- Black-box (no internal knowledge, only query or statistical access).
Adversary goals and capabilities are clarified in benchmarks such as DeAttack (Wang et al., 24 Nov 2025), which formalize removal as a constrained optimization over the watermarked sample $x_w$: minimize the perceptual distortion of the output relative to $x_w$ while forcing the detector's decision below its threshold (an illustrative formulation is given below).
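The removal objective can be written generically as the following constrained problem; this is an illustrative formulation consistent with the composite model above, with notation of our own rather than verbatim from any single cited paper.

```latex
% Illustrative removal objective (notation assumed, not taken from any single paper):
% x_w: watermarked sample, \hat{x}: attacker output, d_perc: perceptual distance,
% D: detector score, \tau: operational decision threshold.
\begin{aligned}
\min_{\hat{x}} \quad & d_{\mathrm{perc}}(\hat{x},\, x_w) \\
\text{subject to} \quad & D(\hat{x}) < \tau .
\end{aligned}
```

White-box attackers can differentiate through $D$ directly, whereas beige- and black-box attackers must approximate the constraint through queries or surrogate detectors.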
2. Core Approaches: Visible Watermark Removal
State-of-the-art approaches to visible watermark removal are largely deep-learning based, commonly using encoder-decoder (U-Net or Transformer) architectures with specialized masking, attention, or dual-path structures:
- Two-Stage Architectures: Split detection and restoration (Split-then-Refine (Cun et al., 2020), RIRCI (Leng et al., 2023), SLBR (Liang et al., 2021)). The first stage predicts the watermark mask and a coarse background; the second stage refines the restoration using the mask, often with channel or spatial attention (a minimal sketch of this pattern follows this list).
- Implicit Joint Decoders: Networks such as WMFormer++ (Huo et al., 2023) replace explicit multi-branch decoders with nested transformers and gated feedforward layers, so that watermark localization and background restoration inform each other within a single decoder.
- GAN-Enhanced Methods: Some methods incorporate adversarial training and perceptual losses to enhance realism and suppress artifacts (conditional GANs (Li et al., 2019), WDNet (Liu et al., 2020)).
- Blind and Self-supervised Settings: Approaches such as MorphoMod (Robinette et al., 4 Feb 2025) and PSLNet (Tian et al., 4 Mar 2024) leverage segmentation, mask dilation, and self-supervised training to remove marks without paired clean data.
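As a concrete illustration of the two-stage pattern, the following PyTorch-style sketch predicts a soft watermark mask plus a coarse background and then refines only the masked region; the tiny backbone, channel counts, and module names are placeholders for this sketch, not the actual Split-then-Refine or SLBR implementations.

```python
# Illustrative two-stage "predict mask, then refine" pipeline in PyTorch.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Minimal encoder-decoder standing in for a ResUNet/U-Net backbone."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, padding=1))
    def forward(self, x):
        return self.dec(self.enc(x))

class TwoStageRemover(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = TinyUNet(3, 4)   # predicts 1-channel mask + 3-channel coarse background
        self.stage2 = TinyUNet(7, 3)   # refines given image, coarse estimate, and mask

    def forward(self, x):
        out = self.stage1(x)
        mask = torch.sigmoid(out[:, :1])      # soft watermark-region mask
        coarse = torch.sigmoid(out[:, 1:])    # coarse watermark-free estimate
        refined = torch.sigmoid(self.stage2(torch.cat([x, coarse, mask], dim=1)))
        # Replace only pixels inside the predicted mask; keep the rest of the image.
        return mask * refined + (1 - mask) * x, mask

restored, mask = TwoStageRemover()(torch.rand(1, 3, 256, 256))
```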
Table 1. Example architectural features and datasets
| Method | Backbone | Key Innovations | Dataset(s) |
|---|---|---|---|
| Split-then-Refine (Cun et al., 2020) | 2×ResUNet | Multi-task, mask-guided attention | LOGO-series, MSCOCO |
| WDNet (Liu et al., 2020) | U-Net+ResBlocks | Decomposition, GAN, mask separation | LVW, CLWD |
| SLBR (Liang et al., 2021) | U-Net | Self-calibrated mask, multistage fusion | LVW, CLWD |
| WMFormer++ (Huo et al., 2023) | Transformer | Nested decoder, implicit gating | LOGO, CLWD |
| MorphoMod (Robinette et al., 4 Feb 2025) | U-Net | Morphological dilation, diffusion inpaint | CLWD, LOGO, Alpha1 |
| PSLNet (Tian et al., 4 Mar 2024) | Dual U-Net | Self-supervised pairing, noise-robust | VOC, synthetic |
These models are evaluated by PSNR, SSIM, LPIPS, and sometimes region-focused error metrics (e.g., RMSE on masked locations). For instance, WMFormer++ achieves 44.64 dB PSNR (LOGO-H) and an F1 mask score of 0.8769, surpassing earlier U-Net/ResNet-based models (Huo et al., 2023).
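The quality and localization metrics above can be computed with standard libraries; the sketch below uses scikit-image for PSNR/SSIM and scikit-learn for the mask F1 score, with the function name and the 0.5 binarization threshold as illustrative choices.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity
from sklearn.metrics import f1_score

def evaluate_removal(restored, clean, pred_mask, true_mask):
    """restored/clean: float images in [0, 1] of shape (H, W, 3); masks: (H, W) in [0, 1]."""
    psnr = peak_signal_noise_ratio(clean, restored, data_range=1.0)
    ssim = structural_similarity(clean, restored, channel_axis=-1, data_range=1.0)
    f1 = f1_score((true_mask > 0.5).ravel(), (pred_mask > 0.5).ravel())
    # Region-focused error: RMSE restricted to the ground-truth watermark region.
    region = true_mask > 0.5
    rmse_w = float(np.sqrt(np.mean((restored[region] - clean[region]) ** 2))) if region.any() else 0.0
    return {"PSNR": psnr, "SSIM": ssim, "mask_F1": f1, "RMSE_masked": rmse_w}
```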
3. Practical Algorithms: Invisible and Latent Watermark Removal
Invisible watermark removal faces distinct challenges, as watermark signals are camouflaged in high or low spectral frequencies or embedded in diffusion-model latents:
- Deep Image Prior (DIP): Black-box methods (requiring no clean/reference data) exploit the spectral bias of CNNs: DIP fits low-frequency image structure faster than noise, so early-to-mid optimization iterates suppress high-frequency watermarks (Liang et al., 19 Feb 2025). By monitoring outputs through a detection API, one selects the earliest "evading" iterate with maximal perceptual quality (a sketch of this loop follows the list).
- Single-image Latent-space PGD Attack: For diffusion watermarks, adversarially optimize a small $\ell_\infty$- (or $\ell_2$-)bounded perturbation $\delta$ so that the VAE encoding of $x_w + \delta$ moves outside the latent region recognized as watermarked (Jain et al., 27 Apr 2025).
- Controllable Regeneration Using Diffusion: Methods like CtrlRegen (Liu et al., 7 Oct 2024) and SADRE (Alam et al., 17 Apr 2025) apply controllable noise to the latent or image, guided by semantic and/or saliency masks, then reconstruct the image via reverse diffusion. The “start from noise” strategy ensures high watermark disruption, while control adapters preserve image content.
- Two-stage Degradation + Restoration: DeAttack (Wang et al., 24 Nov 2025) and the NeurIPS 2024 challenge winner (Shamshad et al., 28 Aug 2025) destroy watermark structure using Gaussian blur, noise, JPEG artifacts, and (optionally) latent perturbation, then restore fidelity with IRNeXt/SwinIR or VAE/test-time optimization plus color correction.
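The DIP-style loop referenced above can be sketched as follows; `detect_api` is a hypothetical black-box detector callable that returns True while a watermark is still detected, and the shallow CNN stands in for the untrained U-Net prior used in practice.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def dip_removal(x_w, detect_api, steps=3000, check_every=50, lr=1e-3):
    """x_w: watermarked image tensor of shape (1, 3, H, W) in [0, 1]."""
    # Shallow untrained CNN as a stand-in for the usual U-Net prior.
    net = nn.Sequential(
        nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 3, 3, padding=1),
    )
    z = torch.rand_like(x_w)                        # fixed random input to the prior
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for step in range(steps):
        opt.zero_grad()
        out = torch.sigmoid(net(z))
        F.mse_loss(out, x_w).backward()             # fit the watermarked image
        opt.step()
        if step % check_every == 0:
            candidate = out.detach()
            # Early iterates capture low-frequency structure before the high-frequency
            # watermark: return the first candidate the detector no longer flags.
            if not detect_api(candidate):
                return candidate
    return None                                     # no evading iterate found
```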
Table 2. Comparison: SOTA invisible watermark removal methods
| Method | Setting | Key Mechanism | ASR / TPR↓ | Ref |
|---|---|---|---|---|
| DIP | Black-box | Untrained U-Net prior | 99% (DwtDct) | (Liang et al., 19 Feb 2025) |
| Latent-PGD | Black-box | Norm-bounded attack on VAE encoding | 99% (TreeRing) | (Jain et al., 27 Apr 2025) |
| DeAttack | White-box + restoration | Degrade+Restore chain | TPR 48-56% | (Wang et al., 24 Nov 2025) |
| CtrlRegen | Diffusion | Semantic+spatial adapters | TPR 0.01-0.12 | (Liu et al., 7 Oct 2024) |
| NeurIPS 2024 | Mixed | VAE+Diffusion+Lab corr | 95.7% evasion at fixed FPR | (Shamshad et al., 28 Aug 2025) |
| SADRE | Saliency-guided | Noise+reverse diffusion | BRA 0.40-0.48 | (Alam et al., 17 Apr 2025) |
Here, TPR denotes the detector true-positive rate after the attack (lower is better for the attacker), ASR denotes the attack success rate, and BRA the bit-recovery accuracy of the extracted payload.
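These quantities reduce to simple aggregates over detector decisions and decoded payload bits, as in the following sketch; the names and signature are assumptions for illustration.

```python
import numpy as np

def attack_metrics(scores_after_attack, threshold, bits_true=None, bits_decoded=None):
    """scores_after_attack: detector scores on attacked, originally watermarked samples."""
    detected = np.asarray(scores_after_attack) >= threshold
    tpr = detected.mean()                 # true-positive rate after attack (lower = stronger attack)
    metrics = {"TPR": tpr, "ASR": 1.0 - tpr}
    if bits_true is not None and bits_decoded is not None:
        # Bit-recovery accuracy: fraction of payload bits the decoder still recovers.
        metrics["BRA"] = float((np.asarray(bits_true) == np.asarray(bits_decoded)).mean())
    return metrics
```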
4. Implementation, Loss Functions, and Metrics
Architectures employ multi-branch U-Nets, transformers, or ResUNets, augmented with mask prediction, attention (channel or spatial), and perceptual losses; a sketch of a typical composite objective follows this list:
- Structural losses: $L_1$/$L_2$, VGG-perceptual, SSIM.
- Mask/region losses: Cross-entropy or Dice coefficient for localization.
- GAN/adversarial losses: Patch-based discriminators to enforce realism.
- Task-specific: detection feedback, LPIPS, or CIELAB color correction for invisible watermark evasion, and psychoacoustic losses for audio (Li et al., 26 Nov 2025).
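The composite objective referenced above can be assembled roughly as follows; the loss weights and the choice of VGG-16 features are illustrative assumptions rather than values from any cited paper, and an adversarial term from a patch discriminator would be added analogously.

```python
# Hedged sketch of a typical multi-term removal objective in PyTorch.
# (ImageNet normalization of the VGG inputs is omitted for brevity.)
import torch
import torch.nn.functional as F
import torchvision

vgg = torchvision.models.vgg16(weights="IMAGENET1K_V1").features[:16].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def removal_loss(restored, clean, pred_mask, true_mask,
                 lambda_perc=0.05, lambda_mask=1.0):
    l1 = F.l1_loss(restored, clean)                           # structural (pixel) term
    perc = F.mse_loss(vgg(restored), vgg(clean))              # VGG-perceptual term
    mask_bce = F.binary_cross_entropy(pred_mask, true_mask)   # localization term
    return l1 + lambda_perc * perc + lambda_mask * mask_bce
```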
Optimization schedules often use Adam or AdamW, with batch sizes of 4–16 and learning rates typically in the $10^{-4}$ to $10^{-3}$ range. Data augmentation, mask-based losses, and, in some cases, self-supervised pair generation (as in PSLNet (Tian et al., 4 Mar 2024)) are employed when ground truth is difficult to obtain.
Benchmarks rely on standardized datasets: LVW, CLWD (color/gray watermarks), LOGO-series, Alpha1 (opaque), MS-COCO, and synthetic noisy images.
5. Applications, Limitations, and Best Practices
Application scenarios include:
- Content sanitation prior to publication or dataset distribution.
- Copyright circumvention (with ethical and legal implications).
- Adversarial robustness analysis for watermarking schemes.
- Audio: Adaptive dual-path GANs efficiently remove advanced audio watermarks, while retaining perceptual quality and cross-domain generalization (Li et al., 26 Nov 2025).
Limitations:
- Opaque, low-frequency, or semantic watermarks resist classical high-frequency filtering and DIP-based removal; multi-key watermarks (RingID/WIND) also exhibit greater robustness (Jain et al., 27 Apr 2025).
- Fidelity tradeoffs: stronger removal may introduce artifacts, color drift, or slight perceptual loss.
- Dataset constraints: Supervised approaches require synthetic data or accurate ground-truth masks.
- Practical deployment: modern approaches typically require commodity GPUs (12 GB+ VRAM); inference can be near real-time for U-Nets, while diffusion- and transformer-based methods carry substantially higher compute costs.
Best practices:
- Use adaptive mask dilation (MorphoMod (Robinette et al., 4 Feb 2025)) or saliency-driven perturbation (SADRE (Alam et al., 17 Apr 2025)) to prevent over-inpainting.
- Employ multi-term hybrid losses to balance texture, structure, and perceptual fidelity.
- In invisible watermark removal, sweep the removal hyperparameters, verify each candidate against the detector, and use quality proxies (PSNR/SSIM relative to the input) to select the best evading result (Liang et al., 19 Feb 2025, Shamshad et al., 28 Aug 2025); a sketch of this loop follows the list.
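The sweep-and-verify practice in the last bullet can be implemented roughly as below; `remove` and `detect_api` are assumed attacker-side callables (the removal method and a black-box detector), and PSNR against the input serves as the quality proxy.

```python
from skimage.metrics import peak_signal_noise_ratio

def sweep_and_select(x_w, remove, detect_api, strengths=(0.1, 0.2, 0.4, 0.8)):
    best, best_psnr = None, float("-inf")
    for s in strengths:
        candidate = remove(x_w, strength=s)
        if detect_api(candidate):          # still detected: attack failed at this strength
            continue
        # Among evading candidates, keep the one closest to the input (quality proxy).
        psnr = peak_signal_noise_ratio(x_w, candidate, data_range=1.0)
        if psnr > best_psnr:
            best, best_psnr = candidate, psnr
    return best
```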
6. Landscape, Impact, and Evolution
State-of-the-art watermark removal has exposed systematic vulnerabilities in both visible and invisible watermarking, motivating more robust and certifiable embedding schemes. Many removal pipelines proposed in 2023–2025 operate entirely in a blind/black-box setting, challenging defenders to develop embedding and detection algorithms that resist degradation, generative restoration, and adversarial optimization (Wang et al., 24 Nov 2025, Jain et al., 27 Apr 2025, Alam et al., 17 Apr 2025). Ongoing research explores joint adversarial training of removal and watermark embedding, hybrid regularization, extensions to video, and learnable mask and refinement modules. For audio, HarmonicAttack demonstrates high cross-scheme and cross-domain transferability, enabling real-time watermark stripping for diverse audio streams (Li et al., 26 Nov 2025).
The efficacy of practical watermark removal algorithms necessitates the continual evolution of watermarking designs, incorporating low-frequency embedding, deeper semantic coupling, adversarial resilience, and formal verification measures.