
Directional Orthogonal Counterattack (DOC)

Updated 14 November 2025
  • DOC is a test-time defense mechanism that augments PGD by integrating orthogonal gradient exploration and momentum to better navigate the adversarial search space.
  • It introduces a Directional Sensitivity Score (DSS) for adaptive gating, dynamically blending optimized and random perturbations to mitigate overfitting on clean inputs.
  • Empirical results show DOC achieves robust accuracy gains (up to 31% on PGD-10) while maintaining competitive clean performance across multiple datasets.

Directional Orthogonal Counterattack (DOC) is a test-time defense mechanism designed to enhance the adversarial robustness of vision-language pre-training models, particularly CLIP, by expanding the diversity and coverage of counterattacks. Unlike standard Test-Time Counterattack (TTC) methods that rely solely on Projected Gradient Descent (PGD) steps, DOC introduces orthogonal exploration and momentum-based updates to address the limited search space and overfitting issues associated with traditional counterattack strategies. The method incorporates a novel directional sensitivity score for adaptive modulation of counterattack strength, ultimately improving the model's ability to neutralize a broader array of adversarial perturbations while maintaining competitive performance on clean data.

1. Optimization Objective and Update Scheme

DOC operates on a possibly adversarial input $x_{\mathrm{adv}}$, using the CLIP image encoder $I_\theta(\cdot)$ and a perturbation $\delta_{\mathrm{ca}}$ constrained within an $\ell_p$ budget $\epsilon_{\mathrm{ca}}$. The objective extends the TTC maximization formulation:

$$\delta_{\mathrm{ca}}^* = \arg\max_{\|\delta_{\mathrm{ca}}\|_p \le \epsilon_{\mathrm{ca}}} \left\| I_\theta(x_{\mathrm{adv}} + \delta_{\mathrm{ca}}) - I_\theta(x_{\mathrm{adv}}) \right\|_2$$

DOC modifies the classic PGD iterative update by:

  • Augmenting the standard loss gradient with an orthogonal random direction and a momentum buffer.
  • Introducing a Directional Sensitivity Score (DSS) for adaptive modulation.

The iterative steps at each round $t$ are as follows:

  1. Surrogate loss:

$$\mathcal{L}(x_{\mathrm{adv}}, \delta_{\mathrm{ca}}) = \|I_\theta(x_{\mathrm{adv}} + \delta_{\mathrm{ca}}) - I_\theta(x_{\mathrm{adv}})\|_2$$

  2. Normalized gradient:

$$g_t = \frac{\nabla_{x_{\mathrm{adv}}} \mathcal{L}(x_{\mathrm{adv}}, \delta^t_{\mathrm{ca}})}{\|\nabla_{x_{\mathrm{adv}}} \mathcal{L}(x_{\mathrm{adv}}, \delta^t_{\mathrm{ca}})\|_2}$$

  3. Orthogonal random augmentation: sample $r \sim \mathcal{N}(0, I)$ and project it onto the subspace orthogonal to $g_t$:

$$r^\perp_t = \frac{r - \langle r, g_t \rangle g_t}{\|r - \langle r, g_t \rangle g_t\|_2} \quad\text{with}\quad \langle r^\perp_t, g_t \rangle = 0$$

  4. Composite direction:

$$d_t = g_t + \lambda r^\perp_t$$

  5. Momentum buffer update:

$$m_t = \mu m_{t-1} + (1-\mu)\,d_t$$

  6. PGD step and projection:

$$\delta_{\mathrm{ca}}^{t+1} = \Pi_{\|\cdot\|_p \le \epsilon_{\mathrm{ca}}} \left( \delta_{\mathrm{ca}}^{t} + \alpha\,\mathrm{sign}(m_t) \right)$$

where $\lambda$ is the orthogonal strength, $\mu$ is the momentum factor, $\alpha$ is the step size, and $\Pi$ denotes projection onto the $\ell_p$-norm ball.
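To make the update scheme concrete, here is a minimal NumPy sketch of a single DOC iteration. The gradient `grad` is passed in as a stand-in for $\nabla_{x_{\mathrm{adv}}}\mathcal{L}$ (in practice it would come from autograd through the CLIP image encoder), and the function name `doc_step` is illustrative, not from the paper; default hyperparameters follow the paper's reported settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def doc_step(delta, m_prev, grad, eps_ca=4/255, alpha=3/255, lam=1.0, mu=0.55):
    """One DOC update: orthogonal augmentation + momentum + signed PGD step.

    `grad` stands in for the surrogate-loss gradient; here it is supplied
    directly rather than computed through an encoder.
    """
    g = grad / (np.linalg.norm(grad) + 1e-12)        # normalized gradient g_t
    r = rng.standard_normal(g.shape)                 # isotropic sample r
    r_perp = r - np.dot(r, g) * g                    # drop the component along g_t
    r_perp = r_perp / (np.linalg.norm(r_perp) + 1e-12)  # unit orthogonal direction
    d = g + lam * r_perp                             # composite direction d_t
    m = mu * m_prev + (1 - mu) * d                   # momentum buffer m_t
    new_delta = np.clip(delta + alpha * np.sign(m),  # signed ascent step,
                        -eps_ca, eps_ca)             # then l_inf projection
    return new_delta, m, g, r_perp
```

Running a few such steps keeps $\delta_{\mathrm{ca}}$ inside the $\ell_\infty$ ball by construction, and each sampled $r^\perp_t$ is orthogonal to its normalized gradient up to floating-point error.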

2. Complete Algorithmic Workflow

The DOC method is structured as follows, combining orthogonal augmentation with a sensitivity-based gating mechanism:

# Stage 1: Directional Sensitivity Score (DSS) and soft gate
tau_cos = 0
for m in range(1, M + 1):
    eta_m = Uniform(-epsilon_ca, epsilon_ca)          # random probe noise
    x_m = x_input + eta_m
    tau_cos += cos(I_theta(x_m), I_theta(x_input))    # embedding stability
hat_tau = 1 - (tau_cos / M)                           # directional sensitivity score
w = sigmoid(gamma * (tau - hat_tau))                  # soft gate in (0, 1)

# Stage 2: orthogonal momentum counterattack
m_buf = 0
delta_ca_0 = Uniform(-epsilon_ca, epsilon_ca)         # random initialization
delta_ca = delta_ca_0
for t in range(T):
    g_t = normalize(grad_L(x_input, delta_ca))        # step 1: normalized gradient
    r = Normal(0, I)                                  # step 2: sample r, project
    r_perp = normalize(r - dot(r, g_t) * g_t)         #         onto orthogonal subspace
    d_t = g_t + lam * r_perp                          # step 3: composite direction
    m_buf = mu * m_buf + (1 - mu) * d_t               #         momentum update
    delta_ca = project(delta_ca + alpha * sign(m_buf),
                       epsilon_ca)                    # step 4: PGD step + projection
delta_ca_T = delta_ca

# Stage 3: DSS-gated convex blend
final_delta_ca = w * delta_ca_T + (1 - w) * delta_ca_0
return final_delta_ca

The final counterattack perturbation is a convex combination of the fully optimized perturbation and the initial random noise, weighted by the DSS-derived gate $w$.

3. Orthogonal Augmentation and Momentum: Rationale and Mathematical Construction

Standard PGD counterattacks move along $\nabla_x \mathcal{L}$, which, for the TTC objective, may explore only a restricted subset of the $\ell_p$-ball. The key innovation in DOC is the addition of noise lying in the subspace orthogonal to the loss gradient:

  • Any vector $r$ can be decomposed as $r = \langle r, g\rangle g + (r - \langle r, g\rangle g)$.
  • The DOC update discards the parallel component, forming $r^\perp$, and injects it with strength $\lambda$.
  • This composite step, especially after normalization, enables the optimizer to escape narrow basins and locate more generalizable counterattacks.

Momentum $\mu$ acts as a low-pass filter, amplifying directions that consistently improve the objective while filtering high-frequency steps, yielding more stable and effective traversal across the counterattack search space.
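The low-pass behavior can be seen in a toy experiment: averaging noisy per-step directions with an exponential moving average keeps them better aligned with a fixed underlying ascent direction than the raw steps are. The setup below (fixed `base` direction, Gaussian noise, EMA with the paper's $\mu = 0.55$) is an illustrative sketch, not an experiment from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
mu = 0.55
base = np.array([1.0, 0.0])     # consistent underlying ascent direction
m = np.zeros(2)                 # momentum buffer

cos_raw, cos_ema = [], []
for t in range(200):
    d = base + 1.5 * rng.standard_normal(2)   # noisy per-step direction
    m = mu * m + (1 - mu) * d                 # EMA / momentum update
    cos_raw.append(d @ base / np.linalg.norm(d))
    if t >= 10:                               # skip EMA burn-in
        cos_ema.append(m @ base / np.linalg.norm(m))

mean_raw = float(np.mean(cos_raw))   # alignment of raw noisy steps
mean_ema = float(np.mean(cos_ema))   # alignment of momentum-smoothed steps
```

With this seed the momentum-smoothed directions show markedly higher mean cosine alignment with `base` than the raw steps, which is the filtering effect the text describes.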

Empirical ablation demonstrates that orthogonal gradient augmentation (OGA) constitutes the primary driver for increased adversarial robustness, as evidenced by robust accuracy ≥31% (vs. ~21% for TTC) with only small reductions in clean accuracy.

4. Directional Sensitivity Score (DSS) and Adaptive Modulation

DSS is designed to quantify an input's embedding sensitivity to isotropic random perturbations, independent of embedding scale. For each input $x$, $M$ random noise samples $\eta^m$ are drawn, and the average cosine similarity between the perturbed and clean embeddings is subtracted from one:

$$\hat\tau(x) = 1 - \frac{1}{M} \sum_{m=1}^M \cos\left( I_\theta(x + \eta^m), I_\theta(x) \right)$$

Clean images yield $\hat\tau$ near zero (cosine similarity close to 1), while adversarial images show increased embedding drift (larger $\hat\tau$). Rather than employing hard thresholding, a soft gate $w$ is computed:

$$w = \sigma\left(\gamma\,[\tau - \hat\tau(x)]\right)$$

where $\gamma$ is a sharpness hyperparameter and $\tau$ a preset threshold. The final output is the convex blend $\delta_{\mathrm{ca}} = w\,\delta_{\mathrm{ca}}^T + (1-w)\,\delta^0_{\mathrm{ca}}$. This adaptivity prevents excessive perturbation of already-stable (clean) inputs and ensures robust correction on sensitive (likely adversarial) samples.
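The score and gate are straightforward to compute once the cosine similarities are available. The sketch below (hypothetical helper name `dss_gate`; defaults set to the paper's $\tau = 0.5$, $\gamma = 10$) takes the $M$ similarities as inputs rather than running an encoder, and contrasts a clean-like input, whose embedding barely moves under random noise, with an adversarial-like input whose embedding drifts strongly.

```python
import numpy as np

def dss_gate(cos_sims, tau=0.5, gamma=10.0):
    """Compute hat_tau = 1 - mean cosine similarity, then the sigmoid gate w.

    `cos_sims` holds cos(I_theta(x + eta_m), I_theta(x)) for M random probes.
    """
    hat_tau = 1.0 - float(np.mean(cos_sims))            # embedding drift score
    w = 1.0 / (1.0 + np.exp(-gamma * (tau - hat_tau)))  # soft gate in (0, 1)
    return hat_tau, w

# Clean-like input: perturbed embeddings stay nearly parallel to the original.
hat_clean, w_clean = dss_gate([0.99, 0.98, 0.99, 0.97, 0.99])
# Adversarial-like input: perturbed embeddings drift far from the original.
hat_adv, w_adv = dss_gate([0.30, 0.25, 0.35, 0.28, 0.32])
```

Per the gating formula, $w$ decreases monotonically as the drift $\hat\tau$ grows past the threshold $\tau$, so the two regimes receive different convex blends of $\delta_{\mathrm{ca}}^T$ and $\delta^0_{\mathrm{ca}}$.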

5. Empirical Evaluation and Results

DOC was evaluated on 16 diverse datasets, including CIFAR-10/100, STL-10, ImageNet (zero-shot), Caltech-101/256, Oxford Pets, Flowers102, Food101, Stanford Cars, SUN397, Country211, FGVC-Aircraft, EuroSAT, DTD, and PCAM.

Three adversarial threat models were applied:

  • PGD-10 ($\ell_\infty$, $\epsilon_{\mathrm{atk}} = 4/255$)
  • C&W ($\ell_\infty$, $\epsilon = 4/255$)
  • AutoAttack ensemble ($\ell_\infty$, $\epsilon = 4/255$)

Key robust and clean accuracy results are summarized below:

| Setting | Robust Accuracy (%) | Clean Accuracy (%) |
|---|---|---|
| CLIP baseline | 0.06 | 61.5 |
| TTC (prior state of the art) | 21.22 (PGD-10) | 55.6 |
| DOC (full method) | 31.02 (PGD-10) | 58.3 |
| DOC (C&W) | 28.18 | |
| DOC (AutoAttack) | ≈ 21.0 | |
  • DOC yields a +9.8% robust accuracy gain (PGD-10) over TTC, and +30.96% over baseline CLIP.
  • Ablation reveals OGA alone provides ≈31.8% robust, ≈55.4% clean; DSS alone achieves ≈23.4% robust, ≈58.2% clean; full combination balances both metrics.
  • Integration with adversarial fine-tuning (e.g., TeCoA, PMG-AFT, FARE) delivers a further 4–5% robust improvement.

Robust accuracy saturates at $T = 3$–$4$ update steps, with default hyperparameter settings $\epsilon_{\mathrm{ca}} = 4/255$, $\alpha = 3/255$, $\lambda = 1.0$, $\mu = 0.55$, $M = 5$, $\tau = 0.5$, $\gamma = 10$. All experiments use batch size 256 and a single NVIDIA 4090 GPU.

6. Practical Deployment Recommendations and Limitations

DOC requires no additional training but does incur test-time computational overhead proportional to $T$ and $M$.

  • A step count of $T = 3$–$4$ is sufficient; further iterations provide negligible benefit.
  • Orthogonal strength $\lambda \in [0.5, 1.5]$ and momentum $\mu \in [0.4, 0.7]$ are robust intervals.
  • Sensitivity sampling with $M = 5$ offers a stability-cost compromise; $M \in [3, 10]$ is viable.
  • Soft gating with $\gamma \approx 10$–$20$ smooths abrupt transitions.
  • No formal guarantee of global optimality is provided; efficacy depends on the geometry of the CLIP embedding space.
  • The method applies directly only to models with differentiable embedding functions and tractable $\ell_p$ projections.
  • Potential extensions include adaptive orthogonal scaling, refined subspace sampling, and adaptation to alternative vision-language backbones.
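For the two $\ell_p$ balls most common in this setting, the required projections are simple closed forms; the helper names below are illustrative, not from the paper.

```python
import numpy as np

def project_linf(delta, eps):
    """Project delta onto the l_inf ball of radius eps (elementwise clamp)."""
    return np.clip(delta, -eps, eps)

def project_l2(delta, eps):
    """Project delta onto the l_2 ball of radius eps (radial rescale)."""
    n = np.linalg.norm(delta)
    return delta if n <= eps else delta * (eps / n)
```

The $\ell_\infty$ case is the one used throughout the paper's experiments ($\epsilon_{\mathrm{ca}} = 4/255$); the $\ell_2$ case shows that other norms remain tractable as long as the projection is cheap.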

DOC advances test-time counterattack methodology by systematically incorporating orthogonality, momentum, and input-sensitive gating, leading to substantial gains in adversarial robustness with minimal sacrifice in clean evaluation accuracy.
