- The paper demonstrates that adversarial training increases model reliance on spurious features, thereby compromising natural distributional robustness.
- A theoretical analysis of linear regression on Gaussian data, with features split into core and spurious components, shows that adversarial training with ℓ1 or ℓ2 perturbations heightens reliance on non-core features, while under ℓ∞ the effect depends on the relative scale of the features.
- Empirical studies across benchmarks such as RIVAL10 and ImageNet variants confirm that adversarial training shifts model sensitivity, reducing performance under distribution shifts.
Explicit Tradeoffs between Adversarial and Natural Distributional Robustness
Introduction
The paper examines the relationship between adversarial robustness and natural distributional robustness in deep neural networks, focusing on how adversarial training influences reliance on spurious features. Through a combined theoretical and empirical analysis, it shows that adversarial training can increase a model's dependence on spurious features, which in turn degrades robustness to distribution shifts.
Theoretical Foundations
The paper begins with a theoretical investigation of a linear regression model on Gaussian data whose features split into core and spurious components. Adversarial training with ℓ1 or ℓ2 perturbations is shown to increase reliance on spurious features, whereas under the ℓ∞ norm the increase occurs only when the spurious features have larger scale than the core features. Intuitively, the adversarial loss encourages the model to spread its weight across more features, so it recruits spurious ones to blunt the attack.
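This mechanism can be made concrete with a small simulation in the ℓ2 case. The sketch below is illustrative rather than the paper's exact construction: for a linear model, the worst-case squared loss under an ℓ2-bounded input perturbation has the closed form (|y − w·x| + ε‖w‖₂)², so adversarial training amounts to minimizing that penalized objective. The dimensions, correlation strengths, and perturbation budget are assumed values chosen for illustration.

```python
# Minimal sketch: linear regression on Gaussian data with a "core" block and a
# weakly correlated "spurious" block. l2 adversarial training uses the closed-form
# worst-case loss (|y - w.x| + eps * ||w||_2)^2. All settings are illustrative.
import torch

torch.manual_seed(0)
n, d_core, d_sp = 5000, 5, 20
eps = 0.5  # l2 perturbation budget (assumed)

# A shared latent signal drives the core features strongly and the spurious ones weakly.
z = torch.randn(n, 1)
x_core = z.repeat(1, d_core) + 0.1 * torch.randn(n, d_core)
x_sp = 0.3 * z.repeat(1, d_sp) + torch.randn(n, d_sp)   # spuriously correlated block
X = torch.cat([x_core, x_sp], dim=1)
y = z.squeeze(1) + 0.1 * torch.randn(n)

def fit(adversarial: bool) -> torch.Tensor:
    w = (0.01 * torch.randn(d_core + d_sp)).requires_grad_(True)
    opt = torch.optim.Adam([w], lr=1e-2)
    for _ in range(2000):
        resid = y - X @ w
        if adversarial:
            # Worst case of (y - w.(x + delta))^2 over ||delta||_2 <= eps, in closed form.
            loss = ((resid.abs() + eps * w.norm(p=2)) ** 2).mean()
        else:
            loss = (resid ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()

for name, adv in [("standard", False), ("adversarially trained", True)]:
    w = fit(adv)
    frac_spurious = (w[d_core:].abs().sum() / w.abs().sum()).item()
    print(f"{name}: fraction of weight mass on spurious features = {frac_spurious:.3f}")
```

Comparing the printed fractions shows how the ℓ2 penalty pushes weight mass toward the correlated spurious block, since spreading weight across many correlated features yields the same fit with a smaller ‖w‖₂.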
Empirical Evidence
Extensive experiments on RIVAL10, Salient ImageNet-1M, ImageNet-9, Waterbirds, and ObjectNet evaluate how adversarial training affects reliance on spurious features.
Figure 1: Snapshot of empirical evidence on the RIVAL10, Salient ImageNet-1M, ImageNet-9, Waterbirds, and ObjectNet benchmarks.
These experiments validate theoretical predictions, demonstrating that adversarially trained models exhibit greater sensitivity to spurious features across various settings.
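For reference, adversarially trained image classifiers of the kind evaluated here are typically produced with a Madry-style ℓ∞ PGD training loop. The sketch below shows that generic recipe, not the paper's exact models or hyperparameters; `model`, `optimizer`, and the data batches are assumed to be defined elsewhere.

```python
# Generic l_inf PGD adversarial training step (Madry-style recipe, illustrative settings).
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Projected gradient ascent on the cross-entropy loss inside an l_inf ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    """One optimizer step on PGD adversarial examples instead of clean inputs."""
    model.eval()                     # keep batch-norm statistics fixed while attacking
    x_adv = pgd_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```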
Impact on Distributional Robustness
The paper highlights a critical trade-off: while adversarial training enhances adversarial robustness, it can reduce the model's ability to generalize under distribution shifts where spurious correlations are altered.
Figure 2: OOD accuracy vs standard ImageNet accuracy for adversarially trained ResNets.
This finding indicates that adversarially trained models are more easily thrown off when natural context, such as the image background, changes, a failure mode that standard benchmarks do not capture.
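The comparison behind Figure 2 amounts to scoring the same model on a standard validation split and on a shifted split (for example, a background-altered, ImageNet-9-style set) and reading off the gap. A minimal sketch, assuming a trained `model` and two hypothetical dataloaders:

```python
# Compare top-1 accuracy on a standard split vs. a background-shifted split.
import torch

@torch.no_grad()
def top1_accuracy(model, loader, device=None):
    """Top-1 accuracy of `model` over a dataloader yielding (image, label) batches."""
    device = device or ("cuda" if torch.cuda.is_available() else "cpu")
    model.eval().to(device)
    correct = total = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

# Hypothetical loaders: a standard validation split and a background-altered split.
# std_acc = top1_accuracy(model, standard_loader)
# ood_acc = top1_accuracy(model, background_shift_loader)
# print(f"standard: {std_acc:.3f}  shifted: {ood_acc:.3f}  gap: {std_acc - ood_acc:.3f}")
```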
Sensitivity Evaluation
The paper further investigates model sensitivity to core versus spurious features using noise-based metrics, relative foreground sensitivity (RFS) and relative core sensitivity (RCS), showing that adversarial training shifts sensitivity away from core features.

Figure 3: Noise-based evaluation of model sensitivity to foreground (RFS on RIVAL10) or core (RCS on Salient ImageNet-1M) regions.
This suggests that adversarial training can unintentionally prioritize robustness to adversarial perturbations at the cost of sensitivity to core, natural features.
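A sketch of a noise-based sensitivity probe in this spirit is given below. It is an illustrative proxy rather than the paper's exact RFS/RCS formulas: Gaussian noise is injected only inside the annotated core/foreground region or only in its complement, and the resulting accuracy drops are compared. A trained `model` and a loader yielding (image, mask, label) triples with a binary core mask are assumed.

```python
# Region-restricted noise probe: corrupt only the core region or only the background.
import torch

@torch.no_grad()
def accuracy_under_region_noise(model, loader, corrupt_core=True, sigma=0.25, device=None):
    """Accuracy when Gaussian noise is added only inside (or only outside) a binary core mask."""
    device = device or ("cuda" if torch.cuda.is_available() else "cpu")
    model.eval().to(device)
    correct = total = 0
    for x, mask, y in loader:                 # mask: (B, 1, H, W), 1 on the core region
        x, mask, y = x.to(device), mask.to(device), y.to(device)
        region = mask if corrupt_core else 1 - mask
        x_noisy = (x + sigma * torch.randn_like(x) * region).clamp(0, 1)
        correct += (model(x_noisy).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

# A core-reliant model degrades more when noise hits the core region than the background:
# acc_core = accuracy_under_region_noise(model, loader, corrupt_core=True)
# acc_bg   = accuracy_under_region_noise(model, loader, corrupt_core=False)
# core_sensitivity_proxy = acc_bg - acc_core   # illustrative proxy, not the exact RFS/RCS formula
```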
Real-World Implications
The findings emphasize that while adversarial training can harden models against crafted inputs, it should be applied with caution because it can weaken robustness to genuine distribution shifts. This balancing act poses new challenges for deploying models in dynamically evolving real-world environments.
Conclusion
This research critically examines the dual axes of adversarial and natural distributional robustness, revealing the trade-offs induced by adversarial training. The insights call for evaluation strategies that cover both robustness axes before deploying AI models in sensitive applications. Future work should explore strategies for effectively balancing these competing robustness objectives.