S2AP: Sharpness-aware Adversarial Pruning

Updated 28 October 2025

The paper introduces a min–max optimization that leverages adversarial score-space perturbations to produce flatter loss landscapes and stable mask selection.
It presents a systematic adversarial pruning methodology that integrates score-space sharpness reduction into existing pipelines, yielding an improvement of 1–2 percentage points in robust accuracy.
The approach is validated on multiple datasets and architectures, demonstrating reduced Hessian eigenvalues and enhanced mask stability, especially under high sparsity.

Score-space Sharpness-aware Adversarial Pruning (S2AP) is a plug-in adversarial pruning approach that minimizes the sharpness of the robust loss function in the space of importance scores used for weight selection. Rather than optimizing scores solely for robust loss, S2AP explicitly incorporates adversarial perturbations in score-space during the mask search phase, flattening the underlying loss landscape and stabilizing mask selection. This yields pruned neural networks with improved adversarial robustness and higher mask stability, and can be integrated into any score-based adversarial pruning pipeline.

1. Motivation: Mask Instability and Sharp Score-space Minima

Adversarial pruning methods compress neural networks while preserving robustness by selecting a subset of weights according to their importance scores, with the mask determined by the top-k scores. The optimization procedure for mask selection typically minimizes robust loss with respect to the scores. However, the loss surface in score-space is highly non-smooth, exhibiting extremely sharp local minima around mask transitions. Small changes to scores can abruptly alter the mask, resulting in large loss variations and unstable mask selection—a phenomenon that degrades adversarial robustness, particularly at high sparsity.
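
To make the instability concrete, the following minimal sketch shows how a top-k mask can flip when a single score moves by a tiny amount. The helper `top_k_mask` and the score values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def top_k_mask(scores, k):
    """Return a 0/1 mask keeping the k largest scores (ties broken by index)."""
    keep = sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]
    mask = [0] * len(scores)
    for i in keep:
        mask[i] = 1
    return mask


# Two score vectors that differ by 0.003 in a single entry.
scores = [0.90, 0.501, 0.499, 0.10]
nudged = [0.90, 0.498, 0.499, 0.10]

mask_before = top_k_mask(scores, k=2)
mask_after = top_k_mask(nudged, k=2)

print(mask_before)  # [1, 1, 0, 0]
print(mask_after)   # [1, 0, 1, 0]
```

Because the mask changes discretely, the pruned network (and hence its loss) can change abruptly even though the scores barely moved.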

S2AP addresses this by actively seeking flatter regions of the score-space loss surface. By minimizing both the robust loss and its sharpness with respect to score perturbations, S2AP regularizes against unstable mask configurations.

2. Score-space Sharpness Minimization Formulation

S2AP reframes mask selection as a min–max optimization over continuous scores $s \in \mathbb{R}^p$ (one per parameter):

$s^* \in \arg\min_{s} \max_{z} \; \hat{\mathcal{L}}(w \odot M(s+z, k))$

subject to

$\|z^{(l)}\| \leq \gamma\|s^{(l)}\|, \quad \forall\ l$

where:

$\hat{\mathcal{L}}(\cdot)$ is a robust loss incorporating adversarial input perturbations (e.g., via PGD).
$w$ is the fixed pre-trained weight vector.
$M(s, k)$ is the binary mask function selecting the top $k$ entries of $s$.
$z$ is an adversarial perturbation on the scores, constrained layer-wise by $\gamma$.
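
The layer-wise constraint can be enforced by rescaling each layer's perturbation back onto its ball. The sketch below assumes the $\ell_2$ norm (the text does not name the norm) and uses hypothetical helper names (`l2_norm`, `project_layerwise`) with made-up values.

```python
import math


def l2_norm(v):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in v))


def project_layerwise(z_layers, s_layers, gamma):
    """Rescale each layer's z so that ||z_l||_2 <= gamma * ||s_l||_2."""
    projected = []
    for z, s in zip(z_layers, s_layers):
        radius = gamma * l2_norm(s)
        norm = l2_norm(z)
        # Leave z untouched if it is already feasible, otherwise shrink it.
        scale = 1.0 if norm <= radius else radius / norm
        projected.append([scale * x for x in z])
    return projected


s_layers = [[3.0, 4.0], [1.0, 0.0]]   # per-layer score norms: 5 and 1
z_layers = [[6.0, 8.0], [0.05, 0.0]]  # first layer violates the bound
z_proj = project_layerwise(z_layers, s_layers, gamma=0.1)

print(z_proj)
```

With $\gamma = 0.1$, the first layer is shrunk to norm $0.5$ while the second, already within its radius of $0.1$, is left unchanged. Scaling the radius by $\|s^{(l)}\|$ keeps the perturbation budget relative to each layer's score magnitude.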

The min–max objective drives the importance scores $s^*$ toward regions where $\hat{\mathcal{L}}$ is flat in score-space, that is, where no admissible perturbation $z$ can produce a large loss increase. The resulting sharpness reduction is quantified by the maximum eigenvalue $\lambda_{\max}$ of the score-space loss Hessian evaluated at $s^*$, which S2AP is reported to lower.
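
As a rough illustration of how $\lambda_{\max}$ can serve as a sharpness measure, the sketch below estimates the largest Hessian eigenvalue by power iteration with finite-difference Hessian-vector products. Everything here is an assumption made for illustration: the two smooth quadratic functions stand in for the score-space loss, and the helper names are hypothetical. How the paper actually evaluates the Hessian through the discrete top-k mask is not described in this summary.

```python
import math


def grad(f, x, eps=1e-5):
    """Central-difference gradient of f at x."""
    g = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        g.append((f(hi) - f(lo)) / (2 * eps))
    return g


def hvp(f, x, v, eps=1e-3):
    """Hessian-vector product via a central difference of gradients."""
    g_plus = grad(f, [xi + eps * vi for xi, vi in zip(x, v)])
    g_minus = grad(f, [xi - eps * vi for xi, vi in zip(x, v)])
    return [(a - b) / (2 * eps) for a, b in zip(g_plus, g_minus)]


def lambda_max(f, x, iters=50):
    """Estimate the dominant Hessian eigenvalue of f at x by power iteration."""
    v = [1.0 / math.sqrt(len(x))] * len(x)
    lam = 0.0
    for _ in range(iters):
        hv = hvp(f, x, v)
        lam = sum(a * b for a, b in zip(v, hv))  # Rayleigh quotient, ||v|| = 1
        n = math.sqrt(sum(a * a for a in hv))
        if n == 0.0:
            break
        v = [a / n for a in hv]
    return lam


def sharp(x):
    return 0.5 * (10.0 * x[0] ** 2 + x[1] ** 2)  # Hessian diag(10, 1)


def flat(x):
    return 0.5 * (2.0 * x[0] ** 2 + x[1] ** 2)   # Hessian diag(2, 1)


x0 = [0.3, -0.2]
print(lambda_max(sharp, x0), lambda_max(flat, x0))  # roughly 10 and 2
```

The flatter function yields the smaller estimate, which is the sense in which a lower $\lambda_{\max}$ indicates a flatter region.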

3. Optimization Pipeline

The S2AP mask selection process comprises:

Initialization:
- Set scores $s$ proportional to the robust model weights.
Adversarial Example Generation:
- For each input batch, generate adversarial examples using the current pruned model (with mask $M(s, k)$).
Score-space Adversarial Perturbation:
- Update $z$ by projected gradient ascent on $\hat{\mathcal{L}}(w \odot M(s+z, k))$, normalizing the gradient per layer and projecting each $z^{(l)}$ onto the ball of radius $\gamma\|s^{(l)}\|$.
Score Update:
- Update $s$ via normalized gradient descent on $\hat{\mathcal{L}}(w \odot M(s+z, k))$, then subtract $z$ to recover the base scores.
Mask Determination:
- After optimization, fix $m^* = M(s^*, k)$ as the final mask.
Finetuning:
- Finetune the weights of the pruned model with adversarial weight perturbation (AWP) to tune robustness under the fixed mask.

This procedure can be applied as a plug-in on top of any existing score-based adversarial pruning strategy (e.g., HARP, HYDRA, RLTH).
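
The order of operations in the perturbation and score-update steps can be sketched on a toy problem. This is a minimal sketch under several loudly stated assumptions, not the authors' implementation: a separable quadratic stands in for the robust loss (no adversarial inputs are generated), there is a single layer, the inner maximization is one normalized ascent step, scores are initialized to weight magnitudes, and gradients pass through the mask with a straight-through surrogate. All function names are hypothetical.

```python
import math


def top_k_mask(scores, k):
    keep = sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]
    return [1 if i in keep else 0 for i in range(len(scores))]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def unit(v):
    """Normalize v, returning zeros when the gradient vanishes."""
    n = norm(v)
    return [x / n for x in v] if n > 0 else [0.0] * len(v)


def toy_loss_and_grad(scores, weights, targets, k):
    """Toy stand-in for the robust loss, with a straight-through score gradient."""
    mask = top_k_mask(scores, k)
    pruned = [w * m for w, m in zip(weights, mask)]
    loss = 0.5 * sum((p - t) ** 2 for p, t in zip(pruned, targets))
    # Assumption: treat d(mask)/d(score) as 1, so dL/ds_i = dL/d(pruned_i) * w_i.
    grad = [(p - t) * w for p, t, w in zip(pruned, targets, weights)]
    return loss, grad


def s2ap_like_step(scores, weights, targets, k, gamma=0.05, lr=0.1):
    # Inner max: one normalized ascent step scaled to the ball radius.
    _, g = toy_loss_and_grad(scores, weights, targets, k)
    radius = gamma * norm(scores)
    z = [radius * d for d in unit(g)]
    # Outer min: normalized descent using the gradient at the perturbed scores.
    perturbed = [s + zi for s, zi in zip(scores, z)]
    _, g_pert = toy_loss_and_grad(perturbed, weights, targets, k)
    updated = [p - lr * d for p, d in zip(perturbed, unit(g_pert))]
    # Subtract z to recover the base scores.
    return [u - zi for u, zi in zip(updated, z)], z


weights = [1.0, -2.0, 0.5, 3.0]
targets = [1.0, 0.0, 0.0, 3.0]       # the mask keeping entries 0 and 3 fits exactly
scores = [abs(w) for w in weights]   # assumed magnitude-based initialization
k = 2

initial_loss, _ = toy_loss_and_grad(scores, weights, targets, k)
for _ in range(20):
    scores, z = s2ap_like_step(scores, weights, targets, k)
final_loss, _ = toy_loss_and_grad(scores, weights, targets, k)
final_mask = top_k_mask(scores, k)

print(initial_loss, final_loss, final_mask)  # 2.5 0.0 [1, 0, 0, 1]
```

The toy only illustrates the perturb, differentiate, update, restore sequence. A single ascent step is a crude inner maximizer, and a realistic pipeline would use mini-batches, adversarial examples, and per-layer projection.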

4. Experimental Evaluation and Metrics

Extensive experiments confirm S2AP's benefits across datasets (CIFAR-10, SVHN, ImageNet), architectures (ResNet, VGG, WideResNet, ViT), and sparsity levels (up to 99%). Key results include:

Robust Accuracy: S2AP consistently yields 1–2 percentage points higher robust accuracy under adversarial attack compared to baselines.
Sharpness Reduction: The largest score-space Hessian eigenvalue ($\lambda_{\max}$) is lower for S2AP-selected masks, demonstrating that regions chosen by S2AP are flatter.
Mask Stability: The Hamming distance between masks selected over multiple epochs is reduced, showing decreased sensitivity of mask selection to transient score variations.
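
The mask-stability metric is straightforward to compute. The sketch below counts flipped entries between masks from consecutive epochs; the function name and the example masks are made up for illustration.

```python
def hamming_distance(mask_a, mask_b):
    """Number of positions where two equally sized binary masks disagree."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    return sum(a != b for a, b in zip(mask_a, mask_b))


# Hypothetical masks recorded at three consecutive epochs.
masks_by_epoch = [
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1],
]

flips = [
    hamming_distance(prev, curr)
    for prev, curr in zip(masks_by_epoch, masks_by_epoch[1:])
]
print(flips)  # [2, 0]
```

A sequence of small or zero distances indicates that the selected mask has stopped changing between epochs.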

These results hold across networks, sparsity regimes, and pruning granularities (unstructured and structured channel pruning).

5. Relation to Sharpness-aware Minimization and Prior Work

S2AP draws on sharpness-aware minimization (SAM) (Wen et al., 2022; Zhang et al., 2024), which operates in weight-space to promote flat minima for improved generalization and robustness. S2AP extends these principles to score-space, the domain of mask selection, where the combinatorial top-k function amplifies sharpness. By analogously regularizing mask selection to avoid adversarially sensitive regions in score-space, S2AP addresses stability issues specific to pruning.

This scheme is distinct from approaches focusing exclusively on latent vulnerability (Madaan et al., 2019), feature-level perturbation (Bair et al., 2023), or Bayesian mask selection. Its plug-in formulation enables integration with score-based adversarial pruning without architectural changes.

6. Practical Considerations and Limitations

S2AP introduces additional computational cost, because each score update also requires gradient ascent steps to compute the score-space perturbation $z$. The layer-wise constraint parameter $\gamma$ is a critical hyperparameter governing the perturbation scale. Tuning $\gamma$ balances sharpness minimization against excessive perturbation; recommended $\gamma$ values are derived empirically per model and sparsity level.

While S2AP empirically generalizes across domains and model types, structured pruning (e.g., channel or filter masks) may require domain-specific adaptation of the score and perturbation mapping.

Potential directions for future work include:

Developing efficient single-step approximations for sharpness minimization in score-space.
Integrating S2AP with structured pruning or quantization for broader model types.
Investigating joint optimization over score-space and weight-space perturbations.

7. Conclusion

S2AP introduces score-space sharpness minimization into adversarial pruning, stabilizing mask selection and enhancing adversarial robustness in pruned neural networks. Its min–max score-space optimization can be adopted as a plug-in to standard pruning pipelines and demonstrates tangible improvements in robust accuracy, loss landscape flatness, and mask stability across architectures and sparsity levels (Piras et al., 21 Oct 2025). This data-driven approach underlines the practical significance of controlling sharpness in the combinatorial space of mask selection for reliable deployment of compressed neural models under adversarial threat.