- The paper shows that, with the regularization strength held fixed, wider networks improve natural accuracy but suffer worse perturbation stability, compromising adversarial robustness through larger local Lipschitz constants.
- It uses a theoretical analysis based on the neural tangent kernel to link increased network width to deteriorating perturbation stability.
- It introduces Width Adjusted Regularization (WAR), which adaptively enlarges the robust regularization weight λ for wider architectures, recovering robustness without costly per-architecture tuning.
Do Wider Neural Networks Really Help Adversarial Robustness?
The study presented in "Do Wider Neural Networks Really Help Adversarial Robustness?" by Wu et al. critically examines the relationship between neural network width and adversarial robustness, particularly in the context of adversarial training, the most widely used defense against adversarial examples. The prevailing intuition is that wider networks, having greater capacity, should be more robust; this paper provides a comprehensive analysis challenging that belief.
Framework and Analysis
Adversarial training bolsters robustness by generating adversarial examples during training and minimizing a combination of the natural risk and a robust regularization term. The regularization term is weighted by a parameter λ, which controls the trade-off between natural accuracy (how well the model performs on clean data) and perturbation stability (the network's resistance to small input perturbations).
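In TRADES-style notation, which matches the trade-off described above, the objective can be written roughly as

$$
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}} \Big[\, \ell\big(f_\theta(x),\, y\big) \;+\; \lambda \max_{\|x' - x\|_\infty \le \epsilon} \ell'\big(f_\theta(x),\, f_\theta(x')\big) \,\Big],
$$

where the first term drives natural accuracy and the second, weighted by λ, penalizes output changes inside the ε-ball and thus drives perturbation stability. The notation here is paraphrased to fix ideas rather than to reproduce the paper's exact formulation; in TRADES, for instance, ℓ' is a KL divergence between clean and perturbed predictions.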
The authors explain why the naive assumption about network width is flawed. Through theoretical analysis and empirical validation, they demonstrate that while wider networks generally yield higher natural accuracy, they often exhibit poorer perturbation stability, and therefore compromised overall robustness. This counterintuitive phenomenon is traced to the network's local Lipschitz constant, which grows with width and makes wider models more sensitive to adversarial perturbations.
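To make local Lipschitzness concrete, here is a minimal PyTorch sketch that estimates an empirical local Lipschitz constant around a batch of inputs by PGD-style ascent on the output gap. The function name and hyperparameters (`eps`, `steps`, `step_size`) are illustrative assumptions, not the authors' exact protocol.

```python
import torch

def local_lipschitz_estimate(model, x, eps=8/255, steps=10, step_size=2/255):
    """Rough estimate of E[max_{||x'-x||_inf <= eps} ||f(x')-f(x)||_1 / ||x'-x||_inf].

    Illustrative sketch; the norms and hyperparameters are assumptions.
    """
    model.eval()
    with torch.no_grad():
        f_x = model(x)  # clean outputs, held fixed during the ascent

    # Random start inside the eps-ball, then iterated sign-gradient ascent.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        gap = (model(x_adv) - f_x).abs().sum(dim=1)  # per-sample output gap
        gap.sum().backward()
        with torch.no_grad():
            x_adv = x_adv + step_size * x_adv.grad.sign()          # ascend the gap
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)  # project to ball
            x_adv = x_adv.clamp(0.0, 1.0)                          # keep valid pixels

    with torch.no_grad():
        num = (model(x_adv) - f_x).abs().sum(dim=1)
        den = (x_adv - x).abs().flatten(1).max(dim=1).values.clamp_min(1e-12)
    return (num / den).mean().item()
```

Under a fixed λ, estimates of this kind come out larger for wider models, which is exactly the degraded stability the paper documents.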
Key Contributions
- Perturbation Stability: The paper isolates perturbation stability, the model's ability to keep its outputs consistent under bounded input perturbations, as a key ingredient of robust accuracy. It shows that wider networks inherently struggle on this axis, contrary to the assumption that width uniformly enhances robustness (a minimal measurement sketch follows after this list).
- Theoretical Insights: Leveraging recent results on the neural tangent kernel (NTK), the work presents a theoretical foundation linking increased network width to deteriorating perturbation stability via larger local Lipschitz constants.
- Heightened Regularization Needs: The authors argue that the robust regularization parameter λ must be carefully re-tuned rather than carried over unchanged from narrower to wider architectures; appropriately increasing λ for wider networks offsets their weaker perturbation stability and unlocks their robustness potential.
- Width Adjusted Regularization (WAR): The paper introduces WAR, a method that adjusts λ on the fly while training wider networks, largely eliminating the computational overhead of manual per-architecture tuning (a deliberately hypothetical scaling sketch appears below).
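As referenced in the Perturbation Stability bullet above, a natural way to operationalize that metric is the fraction of inputs whose predicted label survives an adversarial perturbation. The sketch below assumes an arbitrary attack routine; `pgd_attack` is a hypothetical placeholder, not a specific library call.

```python
import torch

def perturbation_stability(model, loader, pgd_attack, device="cuda"):
    """Fraction of examples whose prediction is unchanged under attack.

    `pgd_attack(model, x)` is a hypothetical stand-in for any routine that
    returns perturbed inputs inside the allowed eps-ball.
    """
    model.eval()
    stable, total = 0, 0
    for x, _ in loader:
        x = x.to(device)
        x_adv = pgd_attack(model, x)  # adversarially perturbed copies of x
        with torch.no_grad():
            pred_clean = model(x).argmax(dim=1)
            pred_adv = model(x_adv).argmax(dim=1)
        stable += (pred_clean == pred_adv).sum().item()
        total += x.size(0)
    return stable / total  # 1.0 means perfectly stable predictions
```

Intuitively, a robustly correct prediction must be both correct on the clean input and stable under attack, which is why width-driven stability losses hurt overall robustness even as natural accuracy rises.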
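The paper's concrete WAR procedure is not reproduced here; the snippet below is a deliberately hypothetical schedule that only illustrates the headline prescription, namely that λ should grow with width. Both the name `width_adjusted_lambda` and the linear scaling rule are assumptions, not the published algorithm.

```python
def width_adjusted_lambda(base_lambda: float, width_factor: int,
                          base_width_factor: int = 1) -> float:
    """Hypothetical rule: enlarge the robust-regularization weight in
    proportion to the widening factor, so wider models are regularized
    harder. WAR itself adjusts lambda adaptively during training; the
    linear scaling here is only an illustrative assumption."""
    return base_lambda * (width_factor / base_width_factor)

# Example: a network widened 10x starts from a 10x larger lambda.
lam = width_adjusted_lambda(base_lambda=6.0, width_factor=10)
```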
Implications and Future Directions
Practically, this research reshapes how adversarial training is tuned, encouraging a more nuanced understanding of the interplay between model capacity and regularization strength. Theoretically, it underscores the need to explore how structural aspects of neural architectures affect robustness, pointing toward architecture-aware learning paradigms.
The findings also open several avenues for future investigation, such as improving empirical methods for approximating local Lipschitzness in more complex architectures or developing alternative regularization techniques that can inherently balance the trade-offs discussed.
In conclusion, the insights from this paper challenge standard practice in adversarial training and offer a more precise account of how network width and regularization interact to determine adversarial robustness. The study is valuable for both theorists and practitioners seeking to build models that remain resilient under adversarial perturbations.