Double-Tag Technique in Watermarking & Classification
- Double-Tag Technique refers to two structurally parallel methods, one in digital watermarking and one in data-driven classifier training, that each use two independent tagging or embedding phases to enhance robustness and detectability.
- In watermarking, the approach embeds watermarks via DCT and SVD, achieving high PSNR (≈52.5 dB) even after noise attacks and ensuring reliable extraction.
- For classifier training, the Tag N’ Train method iteratively refines weak taggers into strong classifiers, improving metrics such as AUC from 0.65 to 0.82 in practical experiments.
The double-tag technique refers to two distinct but structurally parallel methodologies for improving robustness and performance in two research domains: (1) digital watermarking via DCT and SVD dual-embedding, and (2) data-driven classifier training via the Tag N’ Train (TNT) bootstrapping scheme. Both approaches utilize two independent “tagging” or embedding phases—one to enhance imperceptibility and resilience, the other to leverage complementary information for inference or classification. This entry synthesizes the formal models, mathematical foundations, algorithmic steps, and experimental outcomes of these methods as established by the seminal works "A DCT And SVD based Watermarking Technique To Identify Tag" (Ji et al., 2015) and "Tag N' Train: A Technique to Train Improved Classifiers on Unlabeled Data" (Amram et al., 2020).
1. Double-Embedding for Digital Watermarking: DCT+SVD Scheme
The double-tag watermarking approach for images embeds watermark information consecutively in the frequency domain and in the singular-value domain. The process comprises two main stages:
- DCT-Based Embedding: The cover image $I$ is segmented into non-overlapping $n \times n$ blocks. For each block $B$, the 2D DCT is applied:
$$D(u,v) = c(u)\,c(v) \sum_{x=0}^{n-1} \sum_{y=0}^{n-1} B(x,y)\,\cos\!\left[\frac{\pi(2x+1)u}{2n}\right] \cos\!\left[\frac{\pi(2y+1)v}{2n}\right]$$
The watermark is embedded by adding scaled watermark bits to selected mid-band diagonal coefficients:
$$D'(u,v) = D(u,v) + \alpha\, w_i, \qquad (u,v) \in \text{mid-band diagonal},$$
where $\alpha$ is the embedding strength and $w_i$ is the $i$-th watermark bit. The inverse DCT reconstructs the intermediate watermarked image $I'$.
- SVD-Based Embedding: The intermediate image $I'$ is again blockwise decomposed, and each block $B'$ undergoes SVD:
$$B' = U\, S\, V^{T}$$
The same or a different watermark is embedded into the singular values:
$$S' = S + \beta\, W,$$
where $\beta$ is the embedding strength and $W$ is a diagonal matrix encoding the watermark bits. Each block is recomposed as $B'' = U\, S'\, V^{T}$ to obtain the final doubly watermarked image $I''$.
This architecture ensures that the watermark information resides simultaneously in two distinct transform domains (DCT and SVD), strengthening resistance to removal or distortion in either domain alone.
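A minimal NumPy/SciPy sketch of the two-stage embedding is given below. The block size, the mid-band position (3, 3), the use of the leading singular value, and the strengths `alpha` and `beta` are illustrative assumptions for this sketch, not the exact settings of Ji et al. (2015).

```python
import numpy as np
from scipy.fft import dctn, idctn

def embed_double_tag(cover, bits, alpha=2.0, beta=0.5, block=8):
    """Blockwise DCT-then-SVD watermark embedding (illustrative sketch).

    `cover` is a 2D grayscale array whose sides are multiples of `block`;
    `bits` holds one watermark bit in {-1, +1} per block. Returns the
    doubly watermarked image and the original coefficients needed later
    for extraction.
    """
    h, w = cover.shape
    out = cover.astype(float).copy()
    originals = {}          # side information kept for the extraction phase
    k = 0
    for i in range(0, h, block):
        for j in range(0, w, block):
            blk = out[i:i + block, j:j + block]

            # Stage 1: embed in a mid-band DCT coefficient of the block.
            D = dctn(blk, norm='ortho')
            originals[(i, j, 'dct')] = D[3, 3]
            D[3, 3] += alpha * bits[k]
            blk = idctn(D, norm='ortho')

            # Stage 2: embed in the leading singular value of the block.
            U, S, Vt = np.linalg.svd(blk, full_matrices=False)
            originals[(i, j, 'svd')] = S[0]
            S[0] += beta * bits[k]
            out[i:i + block, j:j + block] = U @ np.diag(S) @ Vt
            k += 1
    return out, originals
```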
2. Extraction, Performance Criteria, and Robustness
Extraction operates in reverse:
- SVD-Phase Extraction: Each watermarked image block $B''$ undergoes SVD:
$$B'' = U''\, S''\, V''^{T}$$
The embedded bits are estimated by:
$$\hat{w}_i = \frac{\sigma''_i - \sigma'_i}{\beta},$$
where $\sigma'_i$ are the singular values from the intermediate image $I'$.
- DCT-Phase Extraction: After SVD extraction and optional reconstruction of $I'$, the DCT is again applied blockwise, and watermark bits are extracted from the mid-band diagonal coefficients as $\hat{w}_i = \big(D''(u,v) - D(u,v)\big)/\alpha$, using the stored or regenerated original coefficients $D(u,v)$ (a combined extraction sketch follows this list).
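A matching extraction sketch, under the same illustrative assumptions as the embedding function above (per-block side information, mid-band position (3, 3), leading singular value):

```python
import numpy as np
from scipy.fft import dctn

def extract_double_tag(marked, originals, alpha=2.0, beta=0.5, block=8):
    """Estimate the per-block watermark bits from both domains.

    Returns (bits_dct, bits_svd); agreement between the two estimates
    can serve as the authenticity check. The DCT-phase estimate carries
    a small perturbation from the later SVD embedding (see Section 4).
    """
    h, w = marked.shape
    bits_dct, bits_svd = [], []
    for i in range(0, h, block):
        for j in range(0, w, block):
            blk = marked[i:i + block, j:j + block].astype(float)

            # SVD phase: compare the leading singular value with the stored one.
            S = np.linalg.svd(blk, compute_uv=False)
            bits_svd.append(np.sign((S[0] - originals[(i, j, 'svd')]) / beta))

            # DCT phase: compare the mid-band coefficient with the stored one.
            D = dctn(blk, norm='ortho')
            bits_dct.append(np.sign((D[3, 3] - originals[(i, j, 'dct')]) / alpha))
    return np.array(bits_dct), np.array(bits_svd)
```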
Robustness is quantified using the Peak Signal-to-Noise Ratio (PSNR),
$$\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\right),$$
where $\mathrm{MAX} = 255$ for 8-bit images and MSE is the mean squared error between the cover and watermarked images. For the reported embedding strength, typical PSNR is 52.49 dB (imperceptible distortion). Even after attacks (Gaussian, salt-and-pepper noise), PSNR remains above 35 dB, and watermark recoverability is preserved. A threshold on PSNR is adopted for authenticity verification (Ji et al., 2015).
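The PSNR criterion translates directly into code; a minimal implementation under the 8-bit assumption MAX = 255:

```python
import numpy as np

def psnr(reference, distorted, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    mse = np.mean((reference.astype(float) - distorted.astype(float)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)
```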
3. Double-Tag Bootstrapping for Data-Driven Classifier Training
The TNT (Tag N’ Train) methodology applies a two-phase tagging protocol to exploit structure in unlabeled datasets, particularly events with two correlated sub-objects, such as dijet events in collider physics. The core procedure is:
- Phase 1: Weak Tagging. A weak classifier $T_w$ is applied to object $O_1$ in every event. Thresholds on its output partition the dataset into signal-rich ($D_S$) and background-rich ($D_B$) samples:
$$D_S = \{\,x : T_w(O_1(x)) > c_{\mathrm{hi}}\,\}, \qquad D_B = \{\,x : T_w(O_1(x)) < c_{\mathrm{lo}}\,\}$$
In collider settings, $T_w$ is often an autoencoder with reconstruction loss $L_{\mathrm{AE}}$, and tags are defined by quantiles of $L_{\mathrm{AE}}$.
- Phase 2: Strong Classifier Training. A classifier $C$ is trained on the complementary object $O_2$ to distinguish $D_S$ (pseudo-signal, $y=1$) from $D_B$ (pseudo-background, $y=0$) using a weighted cross-entropy loss:
$$\mathcal{L}(C) = -\sum_{x \in D_S} w_S \log C\big(O_2(x)\big) \;-\; \sum_{x \in D_B} w_B \log\big(1 - C\big(O_2(x)\big)\big)$$
Sample weights $w_S$ and $w_B$ normalize for class imbalance.
Iterative refinement optionally alternates roles, using $C$ to retag object $O_1$ and retrain the tagger, over multiple cycles. This bootstraps performance as the signal/background mixtures in $D_S$ and $D_B$ become increasingly pure (Amram et al., 2020). A toy single-iteration sketch is given below.
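The sketch runs one TNT iteration on a toy dataset, with deliberately simplified stand-ins: one scalar feature per sub-object, a quantile cut in place of the autoencoder tagger, and logistic regression in place of the CNN classifier. All names and numbers are illustrative, not taken from Amram et al. (2020).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Toy unlabeled dataset: 2% signal; each event has two correlated sub-objects,
# reduced here to one feature each (the paper uses full jet images instead).
n, f_sig = 100_000, 0.02
is_sig = rng.random(n) < f_sig
obj1 = rng.normal(loc=np.where(is_sig, 1.5, 0.0), scale=1.0)
obj2 = rng.normal(loc=np.where(is_sig, 1.5, 0.0), scale=1.0)[:, None]

# Phase 1: weak tag on sub-object 1 (a quantile cut standing in for an
# autoencoder-loss threshold) defines pseudo-signal and pseudo-background.
c_hi, c_lo = np.quantile(obj1, [0.9, 0.5])
pseudo_sig, pseudo_bkg = obj1 > c_hi, obj1 < c_lo

# Phase 2: strong classifier on sub-object 2, trained on the pseudo-labels;
# class_weight='balanced' plays the role of the normalizing sample weights.
X = np.vstack([obj2[pseudo_sig], obj2[pseudo_bkg]])
y = np.concatenate([np.ones(pseudo_sig.sum()), np.zeros(pseudo_bkg.sum())])
clf = LogisticRegression(class_weight='balanced').fit(X, y)

# Evaluate against the true labels, which are normally unavailable; the full
# method would now swap the roles of the two sub-objects and iterate.
print("AUC vs. true labels:", roc_auc_score(is_sig, clf.predict_proba(obj2)[:, 1]))
```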
4. Theoretical and Algorithmic Foundations
In the TNT method, a mixture-model analysis ensures that, provided the weak tagger yields a higher signal fraction in $D_S$ than in $D_B$ (i.e., $f_S > f_B$), the classifier trained to separate the two mixed samples asymptotically approaches a monotone function of the likelihood ratio that is optimal for distinguishing signal from background:
$$C^{*}(x) \;\propto\; \frac{p_{D_S}(x)}{p_{D_B}(x)} \;=\; \frac{f_S\, p_{\mathrm{sig}}(x) + (1 - f_S)\, p_{\mathrm{bkg}}(x)}{f_B\, p_{\mathrm{sig}}(x) + (1 - f_B)\, p_{\mathrm{bkg}}(x)},$$
which is strictly increasing in $p_{\mathrm{sig}}(x)/p_{\mathrm{bkg}}(x)$ whenever $f_S > f_B$.
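A short numeric check of this monotonicity property, using hypothetical signal fractions for the two tagged samples:

```python
import numpy as np

# If f_S > f_B, the mixed-sample likelihood ratio is a strictly increasing
# function of the pure ratio p_sig/p_bkg, so ranking events by a classifier
# trained on the mixtures is equivalent to ranking by the optimal ratio.
f_S, f_B = 0.12, 0.003                      # hypothetical signal fractions
pure_ratio = np.logspace(-3, 3, 50)         # grid of p_sig(x) / p_bkg(x)
mixed_ratio = (f_S * pure_ratio + (1 - f_S)) / (f_B * pure_ratio + (1 - f_B))
assert np.all(np.diff(mixed_ratio) > 0)     # strictly increasing
```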
In the watermarking context, double embedding leverages the orthogonality of the DCT basis and the stability of singular values under small perturbations. The two embedding operations act in distinct algebraic representations but are not strictly non-interfering: the second embedding can perturb the coefficients carrying the first. This is managed by explicit storage or regeneration of the original coefficients and by careful setting of the embedding strengths $\alpha$ and $\beta$.
5. Practical Implementation and Experimental Results
Experimental setups in both domains demonstrate the utility of double-tag schemes.
- Watermarking (DCT+SVD): For the reported embedding strength $\alpha$, watermarked images maintain PSNR ≈ 52.49 dB. After various noise attacks, PSNR degrades to 35.88–49.30 dB, yet both the DCT and SVD watermarks remain detectable. The scheme resists attempts to remove a watermark by targeting a single domain, as redundancy across the two transform representations ensures residual retrievability (Ji et al., 2015).
- TNT Classifier Training: A worked example for LHC dijet resonance searches uses a convolutional autoencoder as the weak tagger and a CNN as the strong classifier, trained on roughly 200k unlabeled events per iteration. In the LHC Olympics dijet benchmark (1% signal), the classifier AUC is raised from 0.65 (autoencoder alone) to 0.82, and the statistical significance for discovery rises from ≈3σ to ≫5σ. Performance can approach that of fully supervised training when the sub-object independence assumption is satisfied (Amram et al., 2020).
| Domain | Implementation Steps | Typical Outcomes |
|---|---|---|
| DCT+SVD Watermarking | Block DCT mid-band tag → block SVD singular-value tag | PSNR > 50 dB, robust under noise |
| Tag N’ Train (TNT) | Weak tag on object 1 → strong classifier on object 2 | AUC ≈ 0.8+, discovery significance ≫ 5σ |
6. Advantages, Limitations, and Future Directions
Advantages
- In watermarking, double-tag embedding in both DCT and SVD domains yields low visual distortion pre-attack, and high resilience against additive/impulse noise. This redundancy impedes successful attacks targeting only a single domain.
- In data classification, TNT allows model-agnostic, simulation-independent classifier training directly on unlabeled data, well-suited for anomaly detection in two-object systems.
Limitations
- In watermarking, the dual embedding increases computational overhead and can induce minor cumulative degradation of the first tag. Storage of either original DCT coefficients or singular values is necessary.
- In TNT, performance depends critically on the weak tagger’s discriminative ability and the statistical independence of sub-object features on background-only events. The method assumes two or more sub-objects per event and lacks single-object generality.
Extensions
- TNT may incorporate soft (score-based) labeling and alternative unsupervised taggers (e.g., normalizing flows), and can generalize to multi-tag settings with more than two sub-objects.
- Adversarial decorrelation can enforce independence from kinematic variables, such as jet mass or $p_T$, in particle physics applications.
7. Comparative Summary
Both analyzed double-tag approaches establish that embedding, tagging, or training in two orthogonal domains or subspaces significantly enhances system robustness. In image watermarking, redundancy across frequency and singular-value subspaces fortifies watermark integrity. In unsupervised learning, bootstrapping from weak to strong taggers transforms ambiguous labels into increasingly pure classification signals. The trade-off universally observed is between increased algorithmic complexity and the gain in reliability and interpretability.
Key results and methodologies are detailed in "A DCT And SVD based Watermarking Technique To Identify Tag" (Ji et al., 2015) and "Tag N' Train: A Technique to Train Improved Classifiers on Unlabeled Data" (Amram et al., 2020).