
Directional Orthogonal Counterattack (DOC)

Updated 14 November 2025
  • DOC is a test-time defense mechanism that augments PGD by integrating orthogonal gradient exploration and momentum to better navigate the adversarial search space.
  • It introduces a Directional Sensitivity Score (DSS) for adaptive gating, dynamically blending optimized and random perturbations to mitigate overfitting on clean inputs.
  • Empirical results show DOC achieves robust accuracy gains (up to 31% on PGD-10) while maintaining competitive clean performance across multiple datasets.

Directional Orthogonal Counterattack (DOC) is a test-time defense mechanism designed to enhance the adversarial robustness of vision-language pre-training models, particularly CLIP, by expanding the diversity and coverage of counterattacks. Unlike standard Test-Time Counterattack (TTC) methods that rely solely on Projected Gradient Descent (PGD) steps, DOC introduces orthogonal exploration and momentum-based updates to address the limited search space and overfitting issues associated with traditional counterattack strategies. The method incorporates a novel directional sensitivity score for adaptive modulation of counterattack strength, ultimately improving the model's ability to neutralize a broader array of adversarial perturbations while maintaining competitive performance on clean data.

1. Optimization Objective and Update Scheme

DOC operates on a possibly adversarial input $x_{\mathrm{adv}}$, using the CLIP image encoder $I_\theta(\cdot)$ and a perturbation $\delta_{\mathrm{ca}}$ constrained within an $\ell_p$ budget $\epsilon_{\mathrm{ca}}$. The objective extends the TTC maximization formulation:

$$\delta_{\mathrm{ca}}^* = \arg\max_{\|\delta_{\mathrm{ca}}\|_p \le \epsilon_{\mathrm{ca}}} \left\| I_\theta(x_{\mathrm{adv}} + \delta_{\mathrm{ca}}) - I_\theta(x_{\mathrm{adv}}) \right\|_2$$

DOC modifies the classic PGD iterative update by:

  • Augmenting the standard loss gradient with an orthogonal random direction and a momentum buffer.
  • Introducing a Directional Sensitivity Score (DSS) for adaptive modulation.

The iterative steps at each round $t$ are as follows:

  1. Surrogate loss:

$$\mathcal{L}(x_{\mathrm{adv}}, \delta_{\mathrm{ca}}) = \|I_\theta(x_{\mathrm{adv}} + \delta_{\mathrm{ca}}) - I_\theta(x_{\mathrm{adv}})\|_2$$

  2. Normalized gradient:

$$g_t = \frac{\nabla_{x_{\mathrm{adv}}} \mathcal{L}(x_{\mathrm{adv}}, \delta^t_{\mathrm{ca}})}{\|\nabla_{x_{\mathrm{adv}}} \mathcal{L}(x_{\mathrm{adv}}, \delta^t_{\mathrm{ca}})\|_2}$$

  3. Orthogonal random augmentation: sample $r \sim \mathcal{N}(0, I)$ and project it onto the subspace orthogonal to $g_t$:

$$r^\perp_t = \frac{r - \langle r, g_t \rangle g_t}{\|r - \langle r, g_t \rangle g_t\|_2} \quad\text{with}\quad \langle r^\perp_t, g_t \rangle = 0$$

  4. Composite direction:

$$d_t = g_t + \lambda r^\perp_t$$

  5. Momentum buffer update:

$$m_t = \mu m_{t-1} + (1-\mu)\,d_t$$

  6. PGD step and projection:

$$\delta_{\mathrm{ca}}^{t+1} = \Pi_{\|\cdot\|_p \le \epsilon_{\mathrm{ca}}} \left( \delta_{\mathrm{ca}}^{t} + \alpha\,\mathrm{sign}(m_t) \right)$$

where $\lambda$ is the orthogonal strength, $\mu$ is the momentum factor, $\alpha$ is the step size, and $\Pi$ denotes projection onto the $\ell_p$-norm ball.
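To make the update scheme concrete, here is a minimal NumPy sketch of a single DOC iteration. The gradient `grad` is passed in as a stand-in for $\nabla_{x_{\mathrm{adv}}}\mathcal{L}$ (in practice it would come from autograd through the CLIP image encoder), and the function name `doc_step` is illustrative, not from the paper; default hyperparameters follow the paper's reported settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def doc_step(delta, m_prev, grad, eps_ca=4/255, alpha=3/255, lam=1.0, mu=0.55):
    """One DOC update: orthogonal augmentation + momentum + signed PGD step.

    `grad` stands in for the surrogate-loss gradient; here it is supplied
    directly rather than computed through an encoder.
    """
    g = grad / (np.linalg.norm(grad) + 1e-12)        # normalized gradient g_t
    r = rng.standard_normal(g.shape)                 # isotropic sample r
    r_perp = r - np.dot(r, g) * g                    # drop the component along g_t
    r_perp = r_perp / (np.linalg.norm(r_perp) + 1e-12)  # unit orthogonal direction
    d = g + lam * r_perp                             # composite direction d_t
    m = mu * m_prev + (1 - mu) * d                   # momentum buffer m_t
    new_delta = np.clip(delta + alpha * np.sign(m),  # signed ascent step,
                        -eps_ca, eps_ca)             # then l_inf projection
    return new_delta, m, g, r_perp
```

Running a few such steps keeps $\delta_{\mathrm{ca}}$ inside the $\ell_\infty$ ball by construction, and each sampled $r^\perp_t$ is orthogonal to its normalized gradient up to floating-point error.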

2. Complete Algorithmic Workflow

The DOC method is structured as follows, combining orthogonal augmentation with a sensitivity-based gating mechanism:

# Stage 1: Directional Sensitivity Score (DSS) and soft gate
tau_cos = 0
for m in range(1, M + 1):
    eta_m = Uniform(-epsilon_ca, epsilon_ca)          # random probe noise
    x_m = x_input + eta_m
    tau_cos += cos(I_theta(x_m), I_theta(x_input))    # embedding stability
hat_tau = 1 - (tau_cos / M)                           # directional sensitivity score
w = sigmoid(gamma * (tau - hat_tau))                  # soft gate in (0, 1)

# Stage 2: orthogonal momentum counterattack
m_buf = 0
delta_ca_0 = Uniform(-epsilon_ca, epsilon_ca)         # random initialization
delta_ca = delta_ca_0
for t in range(T):
    g_t = normalize(grad_L(x_input, delta_ca))        # step 1: normalized gradient
    r = Normal(0, I)                                  # step 2: sample r, project
    r_perp = normalize(r - dot(r, g_t) * g_t)         #         onto orthogonal subspace
    d_t = g_t + lam * r_perp                          # step 3: composite direction
    m_buf = mu * m_buf + (1 - mu) * d_t               #         momentum update
    delta_ca = project(delta_ca + alpha * sign(m_buf),
                       epsilon_ca)                    # step 4: PGD step + projection
delta_ca_T = delta_ca

# Stage 3: DSS-gated convex blend
final_delta_ca = w * delta_ca_T + (1 - w) * delta_ca_0
return final_delta_ca

The final counterattack perturbation is a convex combination of the fully optimized perturbation and the initial random noise, weighted by the DSS-derived gate $w$.

3. Orthogonal Augmentation and Momentum: Rationale and Mathematical Construction

Standard PGD counterattacks move along $\nabla_x \mathcal{L}$, which, for the TTC objective, may explore only a restricted subset of the $\ell_p$-ball. The key innovation in DOC is the addition of noise lying in the subspace orthogonal to the loss gradient:

  • Any vector $r$ can be decomposed as $r = \langle r, g\rangle g + (r - \langle r, g\rangle g)$.
  • The DOC update discards the parallel component, forming $r^\perp$, and injects it with strength $\lambda$.
  • This composite step, especially after normalization, enables the optimizer to escape narrow basins and locate more generalizable counterattacks.

Momentum $\mu$ acts as a low-pass filter, amplifying directions that consistently improve the objective while filtering high-frequency steps, yielding more stable and effective traversal across the counterattack search space.
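The low-pass behavior can be seen in a toy experiment: averaging noisy per-step directions with an exponential moving average keeps them better aligned with a fixed underlying ascent direction than the raw steps are. The setup below (fixed `base` direction, Gaussian noise, EMA with the paper's $\mu = 0.55$) is an illustrative sketch, not an experiment from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
mu = 0.55
base = np.array([1.0, 0.0])     # consistent underlying ascent direction
m = np.zeros(2)                 # momentum buffer

cos_raw, cos_ema = [], []
for t in range(200):
    d = base + 1.5 * rng.standard_normal(2)   # noisy per-step direction
    m = mu * m + (1 - mu) * d                 # EMA / momentum update
    cos_raw.append(d @ base / np.linalg.norm(d))
    if t >= 10:                               # skip EMA burn-in
        cos_ema.append(m @ base / np.linalg.norm(m))

mean_raw = float(np.mean(cos_raw))   # alignment of raw noisy steps
mean_ema = float(np.mean(cos_ema))   # alignment of momentum-smoothed steps
```

With this seed the momentum-smoothed directions show markedly higher mean cosine alignment with `base` than the raw steps, which is the filtering effect the text describes.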

Empirical ablation demonstrates that orthogonal gradient augmentation (OGA) constitutes the primary driver for increased adversarial robustness, as evidenced by robust accuracy ≥31% (vs. ~21% for TTC) with only small reductions in clean accuracy.

4. Directional Sensitivity Score (DSS) and Adaptive Modulation

DSS is designed to quantify an input's embedding sensitivity to isotropic random perturbations, independent of embedding scale. For each input $x$, $M$ random noise samples $\eta^m$ are drawn, and the average cosine similarity between the perturbed and clean embeddings is subtracted from one:

$$\hat\tau(x) = 1 - \frac{1}{M} \sum_{m=1}^M \cos\left( I_\theta(x + \eta^m), I_\theta(x) \right)$$

Clean images yield $\hat\tau$ near zero (cosine similarity close to 1), while adversarial images show increased embedding drift (larger $\hat\tau$). Rather than employing hard thresholding, a soft gate $w$ is computed:

$$w = \sigma\left(\gamma\,[\tau - \hat\tau(x)]\right)$$

where $\gamma$ is a sharpness hyperparameter and $\tau$ a preset threshold. The final output is the convex blend $\delta_{\mathrm{ca}} = w\,\delta_{\mathrm{ca}}^T + (1-w)\,\delta^0_{\mathrm{ca}}$. This adaptivity prevents excessive perturbation of already-stable (clean) inputs and ensures robust correction on sensitive (likely adversarial) samples.
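The score and gate are straightforward to compute once the cosine similarities are available. The sketch below (hypothetical helper name `dss_gate`; defaults set to the paper's $\tau = 0.5$, $\gamma = 10$) takes the $M$ similarities as inputs rather than running an encoder, and contrasts a clean-like input, whose embedding barely moves under random noise, with an adversarial-like input whose embedding drifts strongly.

```python
import numpy as np

def dss_gate(cos_sims, tau=0.5, gamma=10.0):
    """Compute hat_tau = 1 - mean cosine similarity, then the sigmoid gate w.

    `cos_sims` holds cos(I_theta(x + eta_m), I_theta(x)) for M random probes.
    """
    hat_tau = 1.0 - float(np.mean(cos_sims))            # embedding drift score
    w = 1.0 / (1.0 + np.exp(-gamma * (tau - hat_tau)))  # soft gate in (0, 1)
    return hat_tau, w

# Clean-like input: perturbed embeddings stay nearly parallel to the original.
hat_clean, w_clean = dss_gate([0.99, 0.98, 0.99, 0.97, 0.99])
# Adversarial-like input: perturbed embeddings drift far from the original.
hat_adv, w_adv = dss_gate([0.30, 0.25, 0.35, 0.28, 0.32])
```

Per the gating formula, $w$ decreases monotonically as the drift $\hat\tau$ grows past the threshold $\tau$, so the two regimes receive different convex blends of $\delta_{\mathrm{ca}}^T$ and $\delta^0_{\mathrm{ca}}$.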

5. Empirical Evaluation and Results

DOC was evaluated on 16 diverse datasets, including CIFAR-10/100, STL-10, ImageNet (zero-shot), Caltech-101/256, Oxford Pets, Flowers102, Food101, Stanford Cars, SUN397, Country211, FGVC-Aircraft, EuroSAT, DTD, and PCAM.

Three adversarial threat models were applied:

  • PGD-10 ($\ell_\infty$, $\epsilon_{\mathrm{atk}} = 4/255$)
  • C&W ($\ell_\infty$, $\epsilon = 4/255$)
  • AutoAttack ensemble ($\ell_\infty$, $\epsilon = 4/255$)

Key robust and clean accuracy results are summarized below:

| Setting | Robust Accuracy (%) | Clean Accuracy (%) |
|---|---|---|
| CLIP baseline | 0.06 | 61.5 |
| TTC (prior state of the art) | 21.22 (PGD-10) | 55.6 |
| DOC (full method) | 31.02 (PGD-10) | 58.3 |
| DOC (C&W) | 28.18 | |
| DOC (AutoAttack) | ≈ 21.0 | |
  • DOC yields a +9.8% robust accuracy gain (PGD-10) over TTC, and +30.96% over baseline CLIP.
  • Ablation reveals OGA alone provides ≈31.8% robust, ≈55.4% clean; DSS alone achieves ≈23.4% robust, ≈58.2% clean; full combination balances both metrics.
  • Integration with adversarial fine-tuning (e.g., TeCoA, PMG-AFT, FARE) delivers a further 4–5% robust improvement.

Robust accuracy saturates at $T = 3$–$4$ update steps, with default hyperparameter settings $\epsilon_{\mathrm{ca}} = 4/255$, $\alpha = 3/255$, $\lambda = 1.0$, $\mu = 0.55$, $M = 5$, $\tau = 0.5$, $\gamma = 10$. All experiments use batch size 256 and a single NVIDIA 4090 GPU.

6. Practical Deployment Recommendations and Limitations

DOC requires no additional training but does incur test-time computational overhead proportional to $T$ and $M$.

  • A step count of $T = 3$–$4$ is sufficient; further iterations provide negligible benefit.
  • Orthogonal strength $\lambda \in [0.5, 1.5]$ and momentum $\mu \in [0.4, 0.7]$ are robust intervals.
  • Sensitivity sampling with $M = 5$ offers a stability-cost compromise; $M \in [3, 10]$ is viable.
  • Soft gating with $\gamma \approx 10$–$20$ smooths abrupt transitions.
  • No formal guarantee of global optimality is provided; efficacy depends on the geometry of the CLIP embedding space.
  • The method applies directly only to models with differentiable embedding functions and tractable $\ell_p$ projections.
  • Potential extensions include adaptive orthogonal scaling, refined subspace sampling, and adaptation to alternative vision-language backbones.
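For the two $\ell_p$ balls most common in this setting, the required projections are simple closed forms; the helper names below are illustrative, not from the paper.

```python
import numpy as np

def project_linf(delta, eps):
    """Project delta onto the l_inf ball of radius eps (elementwise clamp)."""
    return np.clip(delta, -eps, eps)

def project_l2(delta, eps):
    """Project delta onto the l_2 ball of radius eps (radial rescale)."""
    n = np.linalg.norm(delta)
    return delta if n <= eps else delta * (eps / n)
```

The $\ell_\infty$ case is the one used throughout the paper's experiments ($\epsilon_{\mathrm{ca}} = 4/255$); the $\ell_2$ case shows that other norms remain tractable as long as the projection is cheap.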

DOC advances test-time counterattack methodology by systematically incorporating orthogonality, momentum, and input-sensitive gating, leading to substantial gains in adversarial robustness with minimal sacrifice in clean evaluation accuracy.
