ARMOR: Agentic Reasoning for Adversarial Attacks
- The paper introduces ARMOR, which dynamically orchestrates multiple attack primitives via AI agents to overcome static ensemble limitations.
- It leverages Vision–Language and Large Language Models for real-time semantic reasoning and adaptive hyperparameter reparameterization.
- Empirical results show ARMOR's superior attack success rates, outperforming conventional methods in transfer-based black-box scenarios.
Agentic Reasoning for Methods Orchestration and Reparameterization (ARMOR) is a closed-loop adversarial attack framework that leverages agentic reasoning to dynamically orchestrate and reparameterize multiple attack primitives via a society of AI agents. ARMOR integrates Vision–Language Models (VLMs) and Large Language Models (LLMs) to achieve robust, transferable, and adaptive adversarial attacks, primarily in a transfer-based black-box scenario. By synthesizing dense, sparse, and geometric perturbations adaptively through a shared "Mixing Desk," ARMOR addresses critical limitations of static ensemble attack suites, namely their lack of semantic awareness and inability to adapt to new models or exploit image-specific vulnerabilities (Rong et al., 26 Jan 2026).
1. Motivation and Problem Setting
ARMOR is motivated by the shortcomings of traditional automated attack ensembles, which operate as static sequences or combinations of fixed hyperparameter attacks (e.g., AutoAttack). These suites lack semantic reasoning capabilities, suffer from stagnation under constrained budgets, and demonstrate limited cross-architecture transferability. ARMOR specifically addresses challenges in the transfer-based black-box setting, wherein perturbations are crafted on an ensemble of surrogate models (e.g., ResNet-50, DenseNet-121) and then applied to a blind target architecture (e.g., ViT-B/16). The core needs addressed by ARMOR include dynamic allocation of attack budgets across complementary geometries (dense, sparse, geometric), semantic targeting of image regions, and real-time hyperparameter reparameterization guided by high-level vision–language reasoning (Rong et al., 26 Jan 2026).
2. Core Architecture and Components
ARMOR orchestrates three canonical adversarial attack primitives: Carlini–Wagner (CW), Jacobian-based Saliency Map Attack (JSMA), and Spatially Transformed Attack (STA). The system is composed of specialized agents, each contributing to distinct stages of the attack orchestration:
- InfoAgent (VLM): Extracts semantically salient cues from the input image, such as region annotations and texture features, using models like Qwen2.5-VL.
- ConductorAgent (LLM): Aggregates semantic reports and baseline confidences to set global constraints, notably the perturbation budget $\epsilon$ and the minimum SSIM threshold $\tau_{\mathrm{SSIM}}$.
- AdvisorAgents (LLM): Propose hyperparameters for each attack method based on the run history $\mathcal{H}$.
- MethodAgents: Execute the CW, JSMA, and STA algorithms with the supplied hyperparameters.
- MixerAgent: Implements the Mixing Desk, optimizing a convex weight vector $w$ to synthesize the output perturbation that maximizes a custom attack score $S$.
- CritiqueAgents: Assess attack outcomes through vectors comprising black-box and surrogate confidences along with SSIM.
- StrategistAgent: Detects stagnation and adaptively relaxes global constraints to facilitate escape from local optima.
The interaction among these agents is iterative and closed-loop, enabling the system to learn from intermediate results and to reparameterize attacks in real time (Rong et al., 26 Jan 2026).
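The iterative data flow among the agents can be sketched as a single closed-loop step. All interfaces below (agent callables, their arguments, and return shapes) are illustrative assumptions, not the paper's API:

```python
def armor_iteration(image, history, eps, ssim_floor,
                    info_agent, conductor, advisors, methods, mixer,
                    critics, strategist):
    """One hypothetical ARMOR closed-loop iteration (all interfaces assumed)."""
    cues = info_agent(image)                         # VLM: semantic regions/textures
    eps, ssim_floor = conductor(cues, history, eps, ssim_floor)
    deltas = {}
    for name, attack in methods.items():             # CW / JSMA / STA in parallel
        params = advisors[name](history)             # LLM-proposed hyperparameters
        deltas[name] = attack(image, params, eps)
    delta, weights = mixer(deltas)                   # convex combination (Mixing Desk)
    report = critics(image, delta)                   # black-box/surrogate conf. + SSIM
    history.append(report)
    eps, ssim_floor = strategist(history, eps, ssim_floor)  # relax on stagnation
    return delta, history, eps, ssim_floor
```

Each agent is a plain callable here, which makes the loop easy to stub out and test; the real system would back these with LLM/VLM queries and the attack implementations.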
3. Formalization of Attack Primitives
3.1 Carlini–Wagner (CW) Attack
CW adversarial examples are generated by solving $\min_{\delta} \|\delta\|_2^2 + c \cdot f(x+\delta)$, where $f$ is the CW margin loss and $c$ trades off distortion against misclassification, subject to $\|\delta\|_\infty \le \epsilon$ and $x+\delta \in [0,1]^n$, with projection performed at each step: $\delta \leftarrow \mathrm{clip}(\delta, -\epsilon, \epsilon)$.
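A minimal numpy sketch of this projected CW-style update on a linear surrogate classifier (the linear model, step size, and loop budget are illustrative assumptions; gradients are analytic here, whereas a real attack would backpropagate through a network):

```python
import numpy as np

def cw_linear(x, W, y_true, eps=0.15, c=1.0, lr=0.05, steps=100):
    """CW-style attack on a linear classifier with logits = W @ x.
    Descends ||delta||_2^2 + c * max(margin, 0), projecting delta onto the
    L_inf ball of radius eps and keeping x + delta inside [0, 1] each step."""
    delta = np.zeros_like(x, dtype=float)
    for _ in range(steps):
        logits = W @ (x + delta)
        others = np.delete(logits, y_true)
        margin = logits[y_true] - np.max(others)   # > 0 means still classified correctly
        grad = 2.0 * delta                         # gradient of ||delta||_2^2
        if margin > 0:                             # margin loss is active
            j = int(np.argmax(others))
            j = j if j < y_true else j + 1         # index back into full logit vector
            grad = grad + c * (W[y_true] - W[j])   # analytic gradient of the margin
        delta = np.clip(delta - lr * grad, -eps, eps)   # L_inf projection
        delta = np.clip(x + delta, 0.0, 1.0) - x        # keep x + delta valid
    return delta
```

On a toy two-class problem this drives the true-class margin toward zero while respecting both constraints.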
3.2 Jacobian-based Saliency Map Attack (JSMA)
At every iteration, JSMA selects a pixel pair $(p, q)$ that maximizes adversarial saliency: $(p^*, q^*) = \arg\max_{(p,q)} \alpha_{pq} \cdot |\beta_{pq}|$, where $\alpha_{pq} = \sum_{i \in \{p,q\}} \partial f_t(x)/\partial x_i$ and $\beta_{pq} = \sum_{i \in \{p,q\}} \sum_{j \neq t} \partial f_j(x)/\partial x_i$, subject to $\alpha_{pq} > 0$, $\beta_{pq} < 0$. Pixels are perturbed and projected to satisfy the sparsity budget $\|\delta\|_0 \le \kappa$ and $x+\delta \in [0,1]^n$.
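The pair-selection rule above can be sketched directly from a precomputed Jacobian. The function name and the dense `J` layout (classes × pixels) are assumptions for illustration:

```python
import numpy as np

def jsma_pick_pair(J, target):
    """Select the pixel pair maximizing JSMA saliency alpha * |beta|.
    J has shape [num_classes, num_pixels]; alpha sums target-class gradients
    over the pair, beta sums the gradients of all other classes."""
    n = J.shape[1]
    alpha_1 = J[target]                    # per-pixel target-class gradient
    beta_1 = J.sum(axis=0) - J[target]     # per-pixel sum over the other classes
    best, best_score = None, -np.inf
    for p in range(n):
        for q in range(p + 1, n):
            alpha = alpha_1[p] + alpha_1[q]
            beta = beta_1[p] + beta_1[q]
            if alpha > 0 and beta < 0:     # admissibility constraints
                score = alpha * abs(beta)
                if score > best_score:
                    best, best_score = (p, q), score
    return best
```

The quadratic pair scan is the textbook formulation; practical implementations prune the candidate set for speed.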
3.3 Spatially Transformed Attack (STA)
STA parameterizes a per-pixel flow field $f = (\Delta u, \Delta v)$ and applies a differentiable warping: each output pixel $x_{\mathrm{adv}}^{(u,v)}$ is bilinearly interpolated from $x$ at location $(u + \Delta u, v + \Delta v)$. Optimizing: $\min_f \mathcal{L}_{\mathrm{adv}}(x_{\mathrm{adv}}) + \tau \mathcal{L}_{\mathrm{flow}}(f)$, where $\mathcal{L}_{\mathrm{flow}}$ encourages smoothness of neighboring flow vectors and $\tau$ is a distortion trade-off.
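The warp and the smoothness penalty can be sketched in numpy as follows (a naive per-pixel loop for clarity, assuming a grayscale `H x W` image and an `H x W x 2` flow field; real implementations vectorize this, e.g. with a grid-sampling op):

```python
import numpy as np

def warp(x, flow):
    """Bilinear warp of image x (H x W) by flow (H x W x 2): each output
    pixel samples x at (i + du, j + dv) via bilinear interpolation."""
    H, W = x.shape
    out = np.zeros_like(x, dtype=float)
    for i in range(H):
        for j in range(W):
            u = min(max(i + flow[i, j, 0], 0.0), H - 1.0)
            v = min(max(j + flow[i, j, 1], 0.0), W - 1.0)
            i0, j0 = int(np.floor(u)), int(np.floor(v))
            i1, j1 = min(i0 + 1, H - 1), min(j0 + 1, W - 1)
            a, b = u - i0, v - j0
            out[i, j] = ((1 - a) * (1 - b) * x[i0, j0] + (1 - a) * b * x[i0, j1]
                         + a * (1 - b) * x[i1, j0] + a * b * x[i1, j1])
    return out

def flow_loss(flow):
    """Smoothness penalty: L2 norms of neighboring flow-vector differences."""
    d_i = np.diff(flow, axis=0)
    d_j = np.diff(flow, axis=1)
    return (np.sqrt((d_i ** 2).sum(-1) + 1e-12).sum()
            + np.sqrt((d_j ** 2).sum(-1) + 1e-12).sum())
```

A zero flow reproduces the input exactly, and a constant flow incurs (near-)zero smoothness loss, matching the stated regularization intent.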
4. Agentic Closed-Loop Orchestration
ARMOR proceeds in discrete iterations $t = 1, \dots, T$, with the following stages:
- Reconnaissance: InfoAgent analyzes the input image and extracts semantic cues. Baseline confidences (for surrogate and black-box) are evaluated.
- Objective Formulation: ConductorAgent sets the budget $\epsilon$ and the SSIM floor $\tau_{\mathrm{SSIM}}$ according to semantic features and baseline confidences.
- Parallel Perturbation Generation: For each method $m \in \{\mathrm{CW}, \mathrm{JSMA}, \mathrm{STA}\}$, AdvisorAgents propose method-specific hyperparameters $\theta_m$, and MethodAgents generate candidate perturbations $\delta_m$.
- Adaptive Perturbation Ensemble: MixerAgent performs randomized hill-climbing over the probability simplex $\{w : w_m \ge 0, \sum_m w_m = 1\}$ to maximize the attack score $S$. Composite perturbation: $\delta = \sum_m w_m \delta_m$.
- Critique and Strategic Adaptation: CritiqueAgents append performance vectors (black-box and surrogate confidences, SSIM) to the run history $\mathcal{H}$. StrategistAgent detects stagnation using sliding-window statistics. If necessary, constraints are relaxed: $\epsilon$ is increased and $\tau_{\mathrm{SSIM}}$ is lowered.
The process repeats until the attack succeeds or the resource budget is exhausted.
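The MixerAgent's randomized hill-climbing over the simplex can be sketched as follows; the Gaussian proposal, renormalization-based projection, and `score_fn` placeholder for the critique signal are all assumptions:

```python
import numpy as np

def mix_hill_climb(deltas, score_fn, iters=200, step=0.1, seed=0):
    """Randomized hill-climbing over the probability simplex: perturb the
    convex weight vector w, project back onto the simplex, and keep the
    move only if the scored composite perturbation sum_m w_m * delta_m
    improves. score_fn stands in for the attack score S."""
    rng = np.random.default_rng(seed)
    M = len(deltas)
    w = np.full(M, 1.0 / M)                       # start from the uniform mix
    best = score_fn(sum(wm * d for wm, d in zip(w, deltas)))
    for _ in range(iters):
        cand = np.clip(w + rng.normal(0.0, step, M), 0.0, None)
        if cand.sum() == 0:
            continue
        cand = cand / cand.sum()                  # back onto the simplex
        s = score_fn(sum(wm * d for wm, d in zip(cand, deltas)))
        if s > best:                              # greedy accept
            w, best = cand, s
    return w, best
```

Because accepted moves are strictly improving, the returned score is never worse than the uniform-mix starting point.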
5. Reparameterization via Advisor Agents
AdvisorAgents implement a gradient-free search over method-specific and global hyperparameters:
- CW: e.g., the distortion/misclassification trade-off constant $c$, step size, and iteration count.
- JSMA: e.g., the per-step perturbation magnitude and sparsity budget.
- STA: e.g., the flow-regularization weight $\tau$ and optimization steps.
- Global: the perturbation budget $\epsilon$ and SSIM floor $\tau_{\mathrm{SSIM}}$.
Given the historical success/failure vectors, AdvisorAgents sample local modifications, predict their impact using learned or heuristic rules, and select those improving the expected attack score $S$. The critique vector supplies the reinforcement signal, incentivizing increases in black-box confidence and SSIM. This mechanism is analogous to hill-climbing or evolutionary strategies, albeit directed by both textual and numeric signals (Rong et al., 26 Jan 2026).
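One local-modification step of this gradient-free search can be sketched as follows; the multiplicative proposal range, candidate count, and `score_fn` (standing in for the critique signal) are illustrative assumptions:

```python
import random

def advisor_step(params, history_score, score_fn, samples=8, rng=None):
    """Hedged sketch of an AdvisorAgent update: sample local multiplicative
    modifications of the hyperparameter dict, score each candidate, and keep
    the best one only if it beats the historical score."""
    rng = rng or random.Random(0)
    best_params, best_score = params, history_score
    for _ in range(samples):                             # small local neighborhood
        cand = {k: v * rng.uniform(0.8, 1.25) for k, v in params.items()}
        s = score_fn(cand)
        if s > best_score:                               # greedy, monotone accept
            best_params, best_score = cand, s
    return best_params, best_score
```

In ARMOR the score would come from the CritiqueAgents' confidence/SSIM vectors rather than a closed-form function; the monotone-accept rule mirrors the hill-climbing analogy in the text.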
6. Empirical Evaluation
ARMOR was evaluated on the AADD-LQ dataset (710 low-quality fake images), attacking a blind ViT-B/16 through surrogates ResNet-50 and DenseNet-121, under a fixed perturbation budget and a maximum of 10,000 queries per image. Key metrics include Attack Success Rate (ASR), SSIM-weighted ASR (wASR), and surrogate-to-black-box transfer indicators.
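A plausible reading of the wASR metric is the per-image success indicator weighted by the SSIM of the adversarial example; the exact definition below is an assumption, not taken from the paper:

```python
def wasr(successes, ssims):
    """SSIM-weighted attack success rate (definition assumed): average of
    success_i * ssim_i over all images, with SSIM values in [0, 1]."""
    assert len(successes) == len(ssims)
    return sum(s * w for s, w in zip(successes, ssims)) / len(successes)
```

Under this definition, wASR is upper-bounded by the plain ASR and penalizes successes obtained at low perceptual fidelity.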
Results:
- On the surrogates, ARMOR achieved a perfect ASR of $1.000$.
- On the blind ViT-B/16, ARMOR achieved ASR $= 0.396$ (wASR $= 0.280$, SSIM $= 0.701$), more than double the ASR of the next-best ensemble, AutoAttack-PGD. Transfer-based attacks such as MI-FGSM and DI-FGSM dropped to markedly lower ASR.
- Transferability: ARMOR attained the highest surrogate-to-black-box transfer, outperforming TI-FGSM and AutoAttack-PGD.
Ablations showed that removing the InfoAgent, or removing agentic orchestration entirely, reduced transfer ASR to nearly zero, establishing the necessity of agentic multi-agent reasoning and hyperparameter adaptation (Rong et al., 26 Jan 2026).
7. Limitations and Future Directions
ARMOR incurs significant computational overhead due to the parallel execution of three gradient-based attacks, real-time LLM/VLM queries, and ensemble hill-climbing, resulting in high GPU and multi-node costs. Generalization to domains beyond low-quality fake images (e.g., natural images or video) has yet to be established. Several future research directions are noted:
- Incorporation of perceptual similarity metrics beyond SSIM, such as LPIPS, to further enhance fidelity-performance trade-offs.
- Extension to additional attack modalities, including generative-model-based and patch attacks.
- Exploration of defenses purpose-built to detect the coordination signals emergent from multi-agent attack orchestration (Rong et al., 26 Jan 2026).
ARMOR reconceptualizes adversarial attack generation as a collaborative, multi-agent reasoning process, yielding enhanced transferability and robustness relative to static attack ensembles.