Alternative Surrogate Loss for PGD
- The paper introduces alternative surrogate losses that diversify PGD’s optimization landscape to enhance attack strength and improve robustness evaluation.
- It details methods like MultiTargeted and alternating objectives, demonstrating measurably lower robust accuracy (i.e., stronger attacks) against adversarially trained models on MNIST and CIFAR-10.
- Practical guidelines emphasize using Adam optimizers, decaying step sizes, and increased restarts to effectively navigate complex loss landscapes during adversarial testing.
Alternative surrogate losses for Projected Gradient Descent (PGD) are loss functions used in place of the standard cross-entropy or margin-based surrogate when constructing adversarial examples within a norm-bounded threat model. These alternative surrogates can enhance the efficacy and generality of PGD attacks and adversarial robustness assessments by modifying the objective landscape explored during optimization. Research in this domain rigorously characterizes the impact of surrogate loss selection, demonstrating both systematic limitations in classical choices and the practical superiority of carefully constructed alternatives in adversarial testing.
1. Background: PGD and Surrogate Losses
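PGD is the canonical first-order method for generating adversarial examples: it performs projected gradient ascent within a threat region (typically an $\ell_\infty$ or $\ell_2$ ball of radius $\epsilon$) to maximize a surrogate loss $L$ with respect to the input $x$. Standard choices of $L$ include:
- Cross-Entropy (CE): $L_{\mathrm{CE}}(x, y) = -\log p_y(x)$
- Margin (CW): $L_{\mathrm{CW}}(x, y) = \max_{j \neq y} z_j(x) - z_y(x)$
Here, $z_j(x)$ is the logit for class $j$, $p_j(x)$ is the softmax output probability for class $j$, and $y$ is the true label.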
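As a concrete illustration, both losses can be computed directly from a logit vector. This is a minimal sketch assuming the usual definitions (negative log-probability of the true class for CE; largest incorrect logit minus the true-class logit for the margin loss). The function names are illustrative and not taken from the cited papers.

```python
import math

def softmax(logits):
    # Subtract the maximum logit for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_loss(logits, y):
    # Negative log-probability assigned to the true class y.
    return -math.log(softmax(logits)[y])

def margin_loss(logits, y):
    # Largest incorrect logit minus the true-class logit.
    # A positive value means the input is misclassified.
    best_other = max(z for j, z in enumerate(logits) if j != y)
    return best_other - logits[y]

logits = [2.0, 1.0, 0.5]
print(cross_entropy_loss(logits, 0))
print(margin_loss(logits, 0))  # -1.0 (correctly classified, margin of 1)
```

Both losses increase as the true class loses ground, but only the margin loss changes sign exactly at the decision boundary.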
The surrogate loss function determines the geometry of the optimization landscape accessed by PGD. Empirical and theoretical work demonstrates that certain losses can become suboptimal for complex, non-linear models, especially under high-dimensional input spaces.
2. Limitations of Classical Surrogates
Empirical evidence indicates that no single surrogate loss dominates across defense models and datasets. For instance, on robust CIFAR-10 models, robust accuracy under PGD can differ by up to 4% depending on which surrogate is chosen (Antoniou et al., 2022). Additionally, convex surrogate losses such as cross-entropy or margin loss may fail to discover adversarial directions that leverage specific model vulnerabilities, especially in locally linear or degenerate regions.
Theoretically, optimizing a convex surrogate need not maximize the true adversarial loss unless the surrogate is aligned with the most vulnerable misclassification direction. This motivates the search for surrogates that more systematically explore the adversarial boundary (Gowal et al., 2019).
3. MultiTargeted Surrogate Loss
The MultiTargeted approach, proposed in "An Alternative Surrogate Loss for PGD-based Adversarial Testing" (Gowal et al., 2019), replaces the single surrogate loss with a set of $K-1$ targeted logit-difference losses, one per incorrect class of a $K$-class classifier:
$L_t(x, y) = z_t(x) - z_y(x), \quad t \neq y$
During the attack, PGD is run with the surrogate $L_t$ for each possible target $t$ (class other than $y$). This ensures that, for each possible misclassification target, the attack explicitly seeks a perturbation prioritizing that direction.
Theoretical Guarantees:
- For affine (linear) models with $K$ classes, the MultiTargeted approach with $K-1$ restarts (one restart per target) is guaranteed to find the true maximum adversarial loss over the threat region [Theorem 3.1, (Gowal et al., 2019)].
- The result extends to locally affine regions for general networks [Theorem 3.2].
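One way to see why the affine case is tractable is to work it out for a linear classifier. The notation below ($W$, $b$, rows $w_j$, radius $\epsilon$) is introduced here for illustration and is not taken from the cited theorems. For logits $z(x) = Wx + b$ and an $\ell_\infty$ ball of radius $\epsilon$, each targeted objective is linear in the perturbation $\delta$ and has a closed-form maximizer:

$$
\max_{\|\delta\|_\infty \le \epsilon} \big[ z_t(x+\delta) - z_y(x+\delta) \big]
= z_t(x) - z_y(x) + \epsilon \, \|w_t - w_y\|_1,
\qquad
\delta^\star_t = \epsilon \, \mathrm{sign}(w_t - w_y).
$$

Because the two maximizations commute, the worst-case untargeted margin is the best of the per-target optima:

$$
\max_{\|\delta\|_\infty \le \epsilon} \, \max_{t \neq y} \big[ z_t(x+\delta) - z_y(x+\delta) \big]
= \max_{t \neq y} \Big[ z_t(x) - z_y(x) + \epsilon \, \|w_t - w_y\|_1 \Big].
$$

Under these assumptions, one PGD run per target suffices: each run solves a linear problem that projected gradient ascent maximizes exactly, and the best result across targets attains the true maximum.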
Practical Algorithm:
For each target class $t \neq y$:
- Initialize at a random point in the threat region.
- Run PGD maximizing $z_t(x) - z_y(x)$ for a fixed number of steps $N$ (preferably with the Adam optimizer).
- Aggregate the best adversarial example across all restarts and targets.
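The loop above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: it assumes a linear classifier (so the input gradient of $z_t - z_y$ is available in closed form), an $\ell_\infty$ threat region, and plain sign-gradient steps in place of the Adam optimizer the source recommends. All names and default values are hypothetical.

```python
import random

def logits(W, b, x):
    # Linear model: z_j = <W[j], x> + b[j].
    return [sum(w * xi for w, xi in zip(row, x)) + bj for row, bj in zip(W, b)]

def sign(v):
    return (v > 0) - (v < 0)

def pgd_targeted(W, b, x, y, t, eps, step_size, n_steps, rng):
    """Maximize z_t - z_y inside the l_inf ball of radius eps around x."""
    # For a linear model, the gradient of z_t - z_y w.r.t. the input is constant.
    grad = [wt - wy for wt, wy in zip(W[t], W[y])]
    # Random start inside the threat region.
    x_adv = [xi + rng.uniform(-eps, eps) for xi in x]
    for _ in range(n_steps):
        # Sign-gradient ascent step (assumption: the source prefers Adam).
        x_adv = [xa + step_size * sign(g) for xa, g in zip(x_adv, grad)]
        # Project back onto the l_inf ball around the clean input.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

def multitargeted_attack(W, b, x, y, eps, step_size=0.05, n_steps=50, seed=0):
    """Run one targeted PGD per incorrect class and keep the best result."""
    rng = random.Random(seed)
    best_x, best_margin = None, float("-inf")
    for t in range(len(W)):
        if t == y:
            continue
        x_adv = pgd_targeted(W, b, x, y, t, eps, step_size, n_steps, rng)
        z = logits(W, b, x_adv)
        # Score every candidate with the untargeted margin.
        margin = max(z[j] for j in range(len(z)) if j != y) - z[y]
        if margin > best_margin:
            best_x, best_margin = x_adv, margin
    return best_x, best_margin

# Toy 3-class, 2-feature model; the clean input is classified as class 0.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x, y = [1.0, 0.2], 0
x_adv, margin = multitargeted_attack(W, b, x, y, eps=0.5)
print(x_adv, margin)  # a positive margin means the attack succeeded
```

In this toy example only the run targeting class 1 crosses the decision boundary, which is why candidates are compared across targets rather than taken from a single run.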
Empirically, combining MultiTargeted and standard PGD achieves lower robust accuracy (i.e., stronger attacks) than untargeted PGD under the same compute budget on benchmark models. For example, it reduces the accuracy of Madry et al.'s MNIST model to 88.36% ($\epsilon = 0.3$) and of their CIFAR-10 model to 44.03% ($\epsilon = 8/255$), both state-of-the-art at time of publication (Gowal et al., 2019).
4. Alternating Objective Surrogates
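"Alternating Objectives Generates Stronger PGD-Based Adversarial Attacks" (Antoniou et al., 2022) proposes temporally cycling between distinct standard loss functions during PGD steps. Instead of fixing a single surrogate $L$, the $N$ attack steps are partitioned into $k$ stages, each stage using a different surrogate from $\{L_{\mathrm{CE}}, L_{\mathrm{CW}}, L_{\mathrm{DLR}}\}$ (CE and CW as in Section 1):
- DLR Loss: $L_{\mathrm{DLR}}(x, y) = -\dfrac{z_y(x) - \max_{j \neq y} z_j(x)}{z_{\pi_1}(x) - z_{\pi_3}(x)}$, where $\pi_1, \pi_2, \pi_3$ are the indices of the top three logits in decreasing order.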
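For reference, the Difference of Logits Ratio (DLR) loss can be computed as follows. This is a minimal sketch assuming the standard definition (negated gap between the true-class logit and the best other logit, normalized by the gap between the largest and third-largest logits); the function name is illustrative.

```python
def dlr_loss(logits, y):
    """DLR loss for a single example; requires at least three classes."""
    ordered = sorted(logits, reverse=True)
    best_other = max(z for j, z in enumerate(logits) if j != y)
    # Normalizing by the top-1 / top-3 gap makes the loss invariant to
    # rescaling of the logits. The denominator is zero when the three
    # largest logits are equal; that case is not handled here.
    return -(logits[y] - best_other) / (ordered[0] - ordered[2])

print(dlr_loss([2.0, 1.0, 0.5], 0))  # negative: correctly classified
print(dlr_loss([1.0, 3.0, 0.0], 0))  # positive: misclassified
```

As with the margin loss, the sign of the DLR loss indicates whether the input is misclassified.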
By switching losses, the adversarial search trajectory is diversified, and the optimizer is less likely to become stuck in regions that are flat or otherwise poorly aligned under any single loss landscape. Two-stage alternation (CE followed by CW, or CE followed by DLR) consistently improves attack strength, reducing robust accuracy across standard datasets compared to single-loss baseline PGD; three-stage alternation yields smaller, though sometimes additional, gains.
The method matches or outperforms AutoAttack’s white-box ensemble and recent strong baselines (e.g., GAMA-PGD) under matched computational budgets.
Best practices:
- Start with cross-entropy (informative gradients everywhere), then switch to CW or DLR loss.
- Two-stage alternation generally achieves the best trade-off between exploration and exploitation under a fixed step budget.
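The stage structure can be expressed as a per-step loss schedule. The sketch below assumes the step budget is split evenly across stages, which is an illustrative choice and not a detail confirmed by the source; the function name and string labels are hypothetical.

```python
def loss_schedule(n_steps, stages):
    """Return the loss label to use at each PGD step, one stage after another."""
    # Integer arithmetic maps step indices 0..n_steps-1 onto stage indices
    # 0..len(stages)-1 in contiguous, near-equal blocks.
    return [stages[step * len(stages) // n_steps] for step in range(n_steps)]

print(loss_schedule(6, ["CE", "CW"]))
# ['CE', 'CE', 'CE', 'CW', 'CW', 'CW']
```

Inside a PGD loop, the label at index `step` would select which surrogate's gradient is used for that update.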
5. Hyperparameter Sensitivity and Optimization
Alternative losses exhibit different sensitivities to step size, optimizer choice, and restart scheduling:
- Optimizer: Adam outperforms sign-gradient and momentum for highly non-linear architectures in both standard and MultiTargeted PGD (Gowal et al., 2019).
- Step Size: Decaying step size schedules reliably escape obfuscated gradients and yield higher attack success with fewer iterations.
- Restarts: MultiTargeted requires cycling through all $K-1$ incorrect target classes to reach its theoretical guarantees. For challenging models, increasing restart count improves attack strength more than further increasing steps per target.
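A decaying step size can be implemented as a simple piecewise-constant schedule. The milestones, decay factor, and initial value below are placeholder assumptions chosen for illustration; the source does not specify a particular schedule.

```python
def step_size_at(step, n_steps, initial=0.1, decay=0.1, milestones=(0.5, 0.75)):
    """Piecewise-constant step size: multiply by `decay` at each milestone."""
    # Count how many milestones (as fractions of the step budget) have passed.
    passed = sum(1 for m in milestones if step >= m * n_steps)
    return initial * decay ** passed

for step in (0, 49, 50, 75, 99):
    print(step, step_size_at(step, n_steps=100))
```

Large early steps move quickly across the threat region, while the smaller late steps refine the candidate near a maximum of the surrogate.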
The following table summarizes empirical robust accuracy under an $\ell_\infty$ threat model for selected benchmarks:
| Model/Dataset | PGD Accuracy | MultiTargeted Accuracy | PGD + MT Accuracy |
|---|---|---|---|
| MADRY MNIST ($\epsilon = 0.3$) | 88.21% (1800×) | 88.43% (200×) | 88.36% (200×) |
| MADRY CIFAR-10 ($\epsilon = 8/255$) | 44.51% (180×) | 44.05% (20×) | 44.03% (20×) |
| TRADES CIFAR-10 ($\epsilon = 0.031$) | 53.70% (20×) | 53.07% (20×) | 53.07% (20×) |
[Data from (Gowal et al., 2019) Table 1.]
6. Practical Guidelines and Limitations
The choice and scheduling of surrogate losses for PGD adversarial testing substantially affect measured robust accuracy. Systematic alternation and MultiTargeted schemes both outperform classical fixed-surrogate attacks across model architectures and datasets. However, these advances introduce compute and tuning overhead, as each loss or target direction requires additional PGD restarts or stage management.
Alternative surrogate losses have not yet been systematically explored for attacks beyond $\ell_\infty$ (e.g., $\ell_2$) or for integration with black-box or transfer-based methods. Their applicability for fully automated adversarial evaluation (e.g., as in AutoAttack) is a topic for ongoing research (Antoniou et al., 2022).
7. Summary and Impact
Alternative surrogate losses, through techniques such as MultiTargeted testing and alternating objectives, represent a principled evolution in adversarial example generation. By diversifying the PGD optimization landscape and systematically covering adversarial directions, these methods both strengthen attack success and enable more rigorous evaluation of robust machine learning models. Empirical benchmarks and theoretical results support their advantage over conventional single-surrogate PGD across a range of tasks and models (Gowal et al., 2019; Antoniou et al., 2022). The refinement of surrogate loss design continues to be integral to the advancement of adversarial robustness evaluation.