Entropy-bank Guided Adversarial Attacks
- EGA is a targeted adversarial method that exploits token-level uncertainty in vision-language models to efficiently alter output trajectories.
- It selects approximately 20% of high-entropy tokens and uses a reusable entropy bank to improve attack transferability across different model architectures.
- Empirical evaluations show over 93% attack success and a significant increase in harmful content conversion compared to global perturbation methods.
Entropy-bank Guided Adversarial Attacks (EGA) refer to a targeted adversarial methodology designed to exploit uncertainty-driven vulnerabilities within autoregressive vision-language models (VLMs). EGA leverages the observation that only a small subset (approximately 20%) of token prediction steps—marked by high token-level entropy—primarily governs output trajectories. By concentrating perturbations at these “critical decision points,” EGA enables both model-specific and cross-model semantic degradation with greater efficiency and transferability than prior global or random attacks. The approach is distinguished by the construction of a reusable entropy bank, allowing for selective, high-impact targeting of tokens that are empirically most susceptible to adversarial manipulation (He et al., 26 Dec 2025).
1. Entropy in Vision-Language Model Decoding
In the context of EGA, entropy serves as a measure of prediction uncertainty at each step of autoregressive VLM generation. For a VLM that autoregressively produces a token sequence $y = (y_1, \dots, y_T)$ conditioned on an image $x$, the token-level entropy at time $t$ is given by

$$H_t = -\sum_{v \in \mathcal{V}} p_\theta(v \mid x, y_{<t}) \log p_\theta(v \mid x, y_{<t}),$$

where $p_\theta(v \mid x, y_{<t})$ is the model’s predicted probability for token $v$ at step $t$ and $\mathcal{V}$ is the vocabulary. A single teacher-forced pass on clean images yields an entropy profile $(H_1, \dots, H_T)$, quantifying generative instability at each position.
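The entropy profile can be computed in a single pass over the per-step logits. The following is a minimal NumPy sketch; the function name and toy logits are illustrative, not the paper’s code:

```python
import numpy as np

def entropy_profile(logits):
    """Token-level entropy H_t for each step of a teacher-forced pass.

    logits: array of shape (T, |V|) -- per-step scores over the vocabulary.
    Returns an array of shape (T,) with H_t = -sum_v p_t(v) log p_t(v).
    """
    # Numerically stable softmax over the vocabulary axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(z)
    p /= p.sum(axis=-1, keepdims=True)
    # Entropy in nats; the tiny epsilon guards against log(0).
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

# Toy check: a confident step has low entropy, a uniform step high entropy.
logits = np.array([[10.0, 0.0, 0.0, 0.0],   # near-deterministic prediction
                   [0.0, 0.0, 0.0, 0.0]])   # uniform over 4 tokens
H = entropy_profile(logits)
```

The uniform step attains the maximum entropy $\log |\mathcal{V}|$, which is what makes high-$H_t$ positions stand out in the profile.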
2. Selection and Characterization of High-Entropy Tokens
EGA identifies the tokens responsible for the majority of generation “forks” by selecting the top quantile of positions with the highest entropies. Specifically, entropies are ranked in descending order, and a set $S = \{t_{(1)}, \dots, t_{(k)}\}$ is formed, where $k = \lceil \rho T \rceil$ and $\rho$ is the selection ratio (default $\rho = 0.2$). Alternatively, a threshold $\tau$ can define the set $S_\tau = \{t : H_t \geq \tau\}$. Empirical findings establish that this ≈20% subset governs the bulk of generative trajectory changes, thus presenting an optimal attack surface.
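The top-quantile selection reduces to a ranking over the entropy profile. A minimal sketch, with an illustrative function name and toy profile:

```python
import numpy as np

def select_high_entropy(H, rho=0.2):
    """Return the indices of the top-ceil(rho*T) highest-entropy positions."""
    T = len(H)
    k = max(1, int(np.ceil(rho * T)))
    # argsort is ascending, so the last k indices are the k largest entropies.
    return set(np.argsort(H)[-k:].tolist())

# Toy profile over T = 10 positions: ceil(0.2 * 10) = 2 positions selected.
H = np.array([0.1, 2.3, 0.4, 1.9, 0.2, 0.3, 2.1, 0.5, 0.6, 1.5])
S = select_high_entropy(H, rho=0.2)
```

A fixed-threshold variant would instead return `{t for t, h in enumerate(H) if h >= tau}`.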
3. Construction and Integration of the Entropy Bank
EGA constructs a reusable entropy bank to generalize across images and models. This bank comprises the top-$K$ tokens from the vocabulary with the highest empirical “flip-rate,” calculated as

$$\mathrm{flip}(v) = \frac{\#\{\text{occurrences of } v \text{ whose prediction changes under perturbation}\}}{\#\{\text{occurrences of } v\}}.$$

Tokens are ranked by flip-rate, and the bank $B$ consists of the top-$K$ most “flippable.” At attack time, the per-image high-entropy mask $S$ is augmented with positions whose decoded token lies in the bank: $M = S \cup \{t : y_t \in B\}$. This augmentation increases attack effectiveness and enables transferability across VLM architectures.
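The bank construction can be sketched as follows. This is an illustrative reading of the flip-rate definition (counting positions whose decoded token changes between clean and perturbed outputs); `build_entropy_bank` and the alignment scheme are assumptions, not the authors’ released code:

```python
from collections import Counter

def build_entropy_bank(clean_seqs, perturbed_seqs, k=50):
    """Rank vocabulary tokens by empirical flip-rate and return the top-k.

    clean_seqs / perturbed_seqs: aligned token sequences decoded from
    clean and perturbed versions of the same images.
    """
    occ, flips = Counter(), Counter()
    for clean, pert in zip(clean_seqs, perturbed_seqs):
        for c, p in zip(clean, pert):
            occ[c] += 1
            if c != p:          # the prediction at this position "flipped"
                flips[c] += 1
    rate = {v: flips[v] / occ[v] for v in occ}
    # Highest flip-rate first; keep the k most "flippable" tokens.
    return [v for v, _ in sorted(rate.items(), key=lambda kv: -kv[1])[:k]]

# Toy example: "a" and "b" each flip half the time, "c" never flips.
clean = [["a", "b", "c"], ["a", "c", "b"]]
pert  = [["a", "x", "c"], ["y", "c", "b"]]
bank = build_entropy_bank(clean, pert, k=2)
```

The per-image mask augmentation is then a set union, e.g. `M = S | {t for t, y in enumerate(tokens) if y in set(bank)}`.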
4. EGA Attack Optimization Procedure
EGA optimizes the adversarial perturbation $\delta$ by maximizing the average entropy over the selected positions $M$, under an $\ell_\infty$-norm constraint on the pixel-level perturbation:

$$\max_{\|\delta\|_\infty \le \epsilon} \; \frac{1}{|M|} \sum_{t \in M} H_t(x + \delta),$$

where $x$ is the normalized pixel input and $\epsilon$ is the standard perturbation budget. Projected gradient ascent with momentum (or the Adam optimizer) is performed over 300 steps, with periodic mask refresh. Exact pseudocode is provided in the source (He et al., 26 Dec 2025), specifying the key steps: initial greedy decoding, entropy computation, mask formation, iterative gradient updates, and final adversarial decoding.
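The update mechanics can be sketched in NumPy. Here `grad_fn` stands in for a backward pass through the VLM restricted to the masked-entropy objective, and the values `eps = 8/255`, `alpha = 2/255`, and `mu = 0.9` are illustrative assumptions rather than the paper’s reported settings; a toy concave objective replaces the model for the runnable check:

```python
import numpy as np

def pgd_ascent(x, grad_fn, eps=8/255, alpha=2/255, steps=300, mu=0.9):
    """Momentum PGD *ascent* inside an l_inf ball of radius eps around x.

    grad_fn(x_adv) returns the gradient of the objective w.r.t. the input.
    All hyperparameter defaults here are illustrative.
    """
    delta = np.zeros_like(x)
    g = np.zeros_like(x)
    for _ in range(steps):
        grad = grad_fn(np.clip(x + delta, 0.0, 1.0))
        # Momentum accumulation over l1-normalized gradients.
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)
        # Signed step, then projection back onto the l_inf ball.
        delta = np.clip(delta + alpha * np.sign(g), -eps, eps)
        # (In the full attack, the high-entropy mask would be refreshed
        # periodically at this point in the loop.)
    return np.clip(x + delta, 0.0, 1.0)

# Toy check: ascending a concave surrogate with maximizer at 0.5 pushes x
# upward, but never further than eps from the starting point.
x0 = np.full(4, 0.3)
x_adv = pgd_ascent(x0, grad_fn=lambda x: -(x - 0.5), eps=8/255, steps=100)
```

The projection step is what keeps the perturbation within the $\ell_\infty$ budget regardless of how many ascent steps are taken.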
5. Empirical Results and Baseline Comparisons
Quantitative evaluation on standard VLMs (Qwen2.5-VL-7B-Instruct, InternVL3.5-4B, LLaVA-1.5-7B) demonstrates that EGA achieves attack success rates (ASR) of 93.1–94.8% with ΔCIDEr scores of 0.85–0.88. Crucially, EGA induces harmful content (violence, self-harm, hate, etc.) in 37.3–47.1% of outputs, more than double the harmful rate achieved by global-entropy (MIE) baselines and greatly exceeding standard methods such as PGD, VLA, and COA. In visual question answering, similar patterns are observed, with EGA producing comparable harmful-conversion rates while retaining high ASR. Transferability metrics show EGA achieves substantially higher harmful rates on unseen target models than transferred XTA or MIE attacks.
| Model | ASR (%) | ΔCIDEr | Harmful Rate (%) |
|---|---|---|---|
| Qwen2.5-VL-7B-Instruct | 94.8 | 0.88 | 42.5 |
| InternVL3.5-4B | 93.8 | 0.86 | 37.3 |
| LLaVA-1.5-7B | 93.1 | 0.85 | 47.1 |
6. Key Insights and Implications
Empirical findings reveal that approximately 20% of tokens with the highest entropy govern the majority of generative variability. Restricting adversarial perturbations to these “decision tokens” not only preserves attack efficacy but leads to disproportionate semantic drift, enabling conversion of roughly 37–47% of benign outputs to harmful content, a vulnerability underreported by prior global-entropy attacks. High-entropy tokens recur across different VLM architectures, enabling practical cross-model transfer of attacks.
A salient implication is that robustness interventions should focus on stabilizing next-token distributions specifically at high-entropy forks, for example, via uncertainty regularization or dynamic token masking, instead of merely increasing average-case robustness. EGA thus exposes a concentrated vulnerability in modern VLM decoding and motivates reevaluation of multimodal safety mechanisms (He et al., 26 Dec 2025).
7. Practical Considerations and Implementation
EGA is implemented under a standardized $\ell_\infty$-norm constraint on normalized pixels, with optimizer options including Adam or momentum-based PGD. Perturbations are iteratively updated, and the selected entropy mask may optionally be refreshed to account for evolving generative trajectories. Parameters such as the token-selection ratio $\rho$, entropy bank size $K$, and mask refresh interval are configurable, with efficacy robust across reasonable variations. Decoding is performed greedily with typical sequence-length restrictions.
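The configurable parameters in this section can be collected into a small config object. All field names here are assumptions, and the specific defaults for `bank_size`, `refresh_every`, and `eps` are illustrative placeholders (only $\rho \approx 0.2$ and the 300-step budget are stated in the text):

```python
from dataclasses import dataclass

@dataclass
class EGAConfig:
    """Illustrative hyperparameter bundle for an EGA-style attack.

    Defaults marked 'illustrative' are assumptions, not reported values.
    """
    rho: float = 0.2          # fraction of highest-entropy positions targeted
    bank_size: int = 50       # top-K most "flippable" tokens (illustrative K)
    steps: int = 300          # gradient-ascent iterations
    refresh_every: int = 50   # mask refresh interval (illustrative)
    eps: float = 8 / 255      # l_inf budget on normalized pixels (illustrative)
    optimizer: str = "adam"   # or "momentum_pgd"

cfg = EGAConfig()
```

Keeping these knobs in one place makes it straightforward to sweep, e.g., `EGAConfig(rho=0.1)` versus `EGAConfig(rho=0.3)` when probing how robust the attack is to the selection ratio.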
A plausible implication is that the EGA framework and entropy bank concept may be adapted to broader autoregressive, multimodal, or sequence-generation settings where token-level uncertainty can be exploited for selective adversarial control.