
Revisiting Adversarial Training at Scale

Published 9 Jan 2024 in cs.CV (arXiv:2401.04727v2)

Abstract: The machine learning community has witnessed a drastic change in the training pipeline, driven by ''foundation models'' of unprecedented scale. However, the field of adversarial training is lagging behind, still centered predominantly on small models such as ResNet-50 and small, low-resolution datasets such as CIFAR-10. To bridge this gap, this paper provides a modern re-examination of adversarial training, investigating its potential benefits when applied at scale. Additionally, we introduce an efficient and effective training strategy that enables adversarial training with giant models and web-scale data at an affordable computing cost. We denote this newly introduced framework AdvXL. Empirical results demonstrate that AdvXL establishes new state-of-the-art robust accuracy records under AutoAttack on ImageNet-1K. For example, by training on the DataComp-1B dataset, AdvXL empowers a vanilla ViT-g model to surpass the previous $l_{\infty}$-, $l_{2}$-, and $l_{1}$-robust accuracy records by margins of 11.4%, 14.2%, and 12.9%, respectively. This achievement positions AdvXL as a pioneering approach, charting a new trajectory for the efficient training of robust visual representations at significantly larger scales. Our code is available at https://github.com/UCSC-VLAA/AdvXL.


Summary

  • The paper introduces AdvXL, a two-stage training framework that enables adversarial training for large-scale models and extensive datasets.
  • Empirical results on ImageNet-1K show improvements in adversarial robustness by up to 14.2% over previous state-of-the-art.
  • AdvXL’s scalable approach opens avenues for robust foundation models with improved generalization against adversarial attacks.

Introduction to Adversarial Training at Scale

The robustness of machine learning models against adversarial attacks is a critical area in AI security. Traditionally, adversarial training, a technique aimed at increasing model resilience, has been constrained by limitations in computational resources and the scale at which it could be employed. This has led to adversarial training being practiced primarily with smaller models on relatively simple datasets.
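At its core, adversarial training replaces clean training inputs with adversarially perturbed ones. The sketch below illustrates the idea on a toy logistic-regression model using a one-step FGSM attack; it is purely illustrative (the paper trains ViTs with multi-step attacks), and all names and numbers are our own.

```python
import numpy as np

# Minimal adversarial-training sketch on a toy logistic model.
# Illustrative only; the paper applies PGD-style training to ViTs.
rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_grad(w, x, y):
    """Gradient of the binary cross-entropy loss w.r.t. the input x."""
    p = sigmoid(w @ x)
    return (p - y) * w

def fgsm(w, x, y, eps):
    """One-step l_inf attack: move x in the sign of the input gradient."""
    return x + eps * np.sign(input_grad(w, x, y))

def adv_train_step(w, x, y, eps, lr):
    """Update the weights on the perturbed input instead of the clean one."""
    x_adv = fgsm(w, x, y, eps)
    p = sigmoid(w @ x_adv)
    grad_w = (p - y) * x_adv  # dL/dw evaluated at the adversarial input
    return w - lr * grad_w

w = rng.normal(size=4)
x = rng.normal(size=4)
y = 1.0
x_adv = fgsm(w, x, y, eps=0.1)
w_new = adv_train_step(w, x, y, eps=0.1, lr=0.5)
```

The extra gradient computations per batch (one per attack step) are exactly what makes adversarial training expensive at scale.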

AdvXL: A New Framework

Researchers at UC Santa Cruz have broken through this barrier with a novel adversarial training framework called AdvXL. AdvXL enables adversarial training with large-scale models and extensive datasets, previously infeasible due to computational cost. By adopting an efficient two-stage strategy, in which a lightweight pre-training stage is followed by a more intensive fine-tuning stage, the framework extends adversarial training to billion-parameter models and web-scale datasets.
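The two-stage idea can be sketched as a coarse-to-fine schedule: spend most epochs on a cheap configuration, then a few on an expensive one. The resolutions, attack steps, and epoch counts below are made-up placeholders, not the paper's hyperparameters.

```python
# Illustrative coarse-to-fine adversarial training schedule in the
# spirit of AdvXL; all numbers are placeholders, not the paper's recipe.
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    image_size: int    # input resolution fed to the model
    attack_steps: int  # attack iterations per batch (attack strength)
    epochs: int

schedule = [
    # Stage 1: cheap pre-training at low resolution with a weak attack.
    Stage("pretrain", image_size=112, attack_steps=1, epochs=8),
    # Stage 2: costly fine-tuning at full resolution with a stronger attack.
    Stage("finetune", image_size=224, attack_steps=3, epochs=1),
]

def relative_cost(stage):
    # Rough cost model: compute grows with pixel count (token count for
    # a ViT) times one forward/backward pass per attack step, per epoch.
    return (stage.image_size ** 2) * (stage.attack_steps + 1) * stage.epochs
```

Under such a cost model, most epochs run at a fraction of full-resolution cost, which is how the overall budget stays affordable.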

Empirical Results and Improvements

Empirical studies using AdvXL have shown exceptional results. On ImageNet-1K, a standard benchmark for visual recognition, AdvXL sets new records for robustness to adversarial attacks, surpassing the previous state-of-the-art under AutoAttack by 11.4%, 14.2%, and 12.9% in $l_{\infty}$-, $l_{2}$-, and $l_{1}$-robust accuracy, respectively.
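The three robustness measures correspond to attacks constrained to different l_p balls. As a reference point, the standard projection operators that keep a perturbation delta inside each ball are sketched below; these are textbook operators, not code from the AdvXL repository.

```python
import numpy as np

# Standard projections of a perturbation delta onto l_p balls of
# radius eps, the three threat models behind the reported numbers.
# Textbook operators, not code from the AdvXL repository.

def project_linf(delta, eps):
    # l_inf ball: clip every coordinate independently.
    return np.clip(delta, -eps, eps)

def project_l2(delta, eps):
    # l_2 ball: rescale the whole vector if it is too long.
    norm = np.linalg.norm(delta)
    return delta if norm <= eps else delta * (eps / norm)

def project_l1(delta, eps):
    # l_1 ball: sort-based soft-thresholding projection.
    if np.abs(delta).sum() <= eps:
        return delta
    u = np.sort(np.abs(delta))[::-1]          # magnitudes, descending
    css = np.cumsum(u)
    ks = np.arange(1, len(u) + 1)
    k = np.nonzero(u * ks > css - eps)[0][-1] # largest valid support size
    theta = (css[k] - eps) / (k + 1)          # shrinkage threshold
    return np.sign(delta) * np.maximum(np.abs(delta) - theta, 0.0)
```

An attack for a given threat model alternates gradient steps with the matching projection, so robust accuracy under each norm probes a genuinely different perturbation set.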

Broader Implications and Future Directions

The success of AdvXL opens new avenues for developing robust machine learning models at a scale akin to foundation models, which are notable for their broad applicability and strong performance. The results suggest that models trained with AdvXL generalize better to previously unseen adversarial attacks, mirroring the generalization observed in other large-scale foundation models. This makes AdvXL a notable step toward adversarial robustness without unmanageably high computational cost.
