Are Labels Required for Improving Adversarial Robustness? (1905.13725v4)

Published 31 May 2019 in cs.LG, cs.CV, and stat.ML

Abstract: Recent work has uncovered the interesting (and somewhat surprising) finding that training models to be invariant to adversarial perturbations requires substantially larger datasets than those required for standard classification. This result is a key hurdle in the deployment of robust machine learning models in many real world applications where labeled data is expensive. Our main insight is that unlabeled data can be a competitive alternative to labeled data for training adversarially robust models. Theoretically, we show that in a simple statistical setting, the sample complexity for learning an adversarially robust model from unlabeled data matches the fully supervised case up to constant factors. On standard datasets like CIFAR-10, a simple Unsupervised Adversarial Training (UAT) approach using unlabeled data improves robust accuracy by 21.7% over using 4K supervised examples alone, and captures over 95% of the improvement from the same number of labeled examples. Finally, we report an improvement of 4% over the previous state-of-the-art on CIFAR-10 against the strongest known attack by using additional unlabeled data from the uncurated 80 Million Tiny Images dataset. This demonstrates that our finding extends as well to the more realistic case where unlabeled data is also uncurated, therefore opening a new avenue for improving adversarial training.

Authors (6)

Jonathan Uesato (29 papers)
Jean-Baptiste Alayrac (38 papers)
Po-Sen Huang (30 papers)
Robert Stanforth (18 papers)
Alhussein Fawzi (20 papers)
Pushmeet Kohli (116 papers)

Citations (323)


Summary

Analyzing the Role of Labels in Adversarial Robustness Enhancement

The paper asks whether labeled data is necessary for improving the adversarial robustness of deep learning models. The question matters because adversarial training needs far more data than standard training, and labels are expensive to collect. Specifically, the authors investigate whether unlabeled data can be as effective as labeled data for training adversarially robust models.

Key Insights and Findings

Data Requirements for Adversarial Training:
- The work starts from a prior finding: adversarial robustness requires substantially more data than standard classification. This is a significant obstacle wherever labeled data is expensive to acquire.
Hypothesis on Unlabeled Data:
- The central hypothesis posited by the authors is that unlabeled data might suffice to bridge the data gap in adversarial training. The premise here is that since adversarial robustness relies on the smoothness of the classifier around natural inputs, it might be possible to infer such smoothness from unlabeled data.
Proposed Methodologies:
- The authors propose two methods: Unsupervised Adversarial Training with Online Targets (UAT-OT) and Fixed Targets (UAT-FT).
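- UAT-OT minimizes a smoothness loss that needs no labels: the model's own predictions on clean unlabeled inputs serve as targets for its predictions on perturbed inputs. UAT-FT uses a model trained on the initial labeled data to generate pseudo-labels, which then serve as fixed targets for further adversarial training.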
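The two objectives can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it assumes the smoothness loss is a KL divergence between clean and perturbed predictions and that the fixed-target loss is a cross-entropy against pseudo-labels. The function names (`uat_ot_loss`, `uat_ft_loss`, `pseudo_label`) are hypothetical, and the search for the worst-case perturbed input (for example by projected gradient ascent) is left out, so the perturbed logits are simply passed in.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))

def uat_ot_loss(clean_logits_batch, adv_logits_batch):
    """Online-targets sketch: mean KL between predictions on clean inputs
    (treated as fixed targets) and predictions on perturbed inputs."""
    losses = [kl_divergence(softmax(c), softmax(a))
              for c, a in zip(clean_logits_batch, adv_logits_batch)]
    return sum(losses) / len(losses)

def pseudo_label(logits):
    """Hard pseudo-label: index of the largest logit of a base classifier."""
    return max(range(len(logits)), key=lambda k: logits[k])

def uat_ft_loss(pseudo_labels, adv_logits_batch, eps=1e-12):
    """Fixed-targets sketch: mean cross-entropy between pseudo-labels
    and predictions on perturbed inputs."""
    losses = [-math.log(softmax(a)[y] + eps)
              for y, a in zip(pseudo_labels, adv_logits_batch)]
    return sum(losses) / len(losses)

# Toy batch: two unlabeled inputs, three classes (made-up logits).
clean = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
perturbed = [[1.0, 0.9, -0.5], [0.4, 0.3, 2.0]]
labels = [pseudo_label(c) for c in clean]
print(uat_ot_loss(clean, perturbed), uat_ft_loss(labels, perturbed))
```

Neither loss uses a ground-truth label: the first compares the model with itself, and the second relies only on the output of a base classifier.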
Statistical Theoretical Validation:
- The paper adapts the Gaussian model of Schmidt et al. to show that, given sufficient unlabeled data, a single labeled example can yield adversarial robustness similar to that of a fully labeled training set. In this setting, the sample complexity of learning from unlabeled data matches the fully supervised case up to constant factors.
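A toy simulation can make this claim concrete. The sketch below assumes a two-class Gaussian model in which the label y is drawn uniformly from {-1, +1} and the input is x = y * theta + sigma * noise, with a linear classifier sign(w . x). All parameter values (`d`, `n`, `sigma`, `eps`) and function names are illustrative assumptions, not taken from the paper. For a linear classifier, the worst-case margin under an l-infinity perturbation of size `eps` is y * (w . x) - eps * ||w||_1, which the code uses to measure robust accuracy exactly.

```python
import random

def sample(rng, theta, sigma):
    """Draw (x, y) with y uniform in {-1, +1} and x ~ N(y * theta, sigma^2 I)."""
    y = rng.choice((-1, 1))
    x = [y * t + sigma * rng.gauss(0.0, 1.0) for t in theta]
    return x, y

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def mean_signed(xs, signs):
    """Estimator w = (1/n) * sum_i s_i * x_i."""
    n, d = len(xs), len(xs[0])
    return [sum(s * x[j] for x, s in zip(xs, signs)) / n for j in range(d)]

def robust_accuracy(w, data, eps):
    """Fraction of points classified correctly under any l-infinity
    perturbation of size eps (exact for a linear classifier)."""
    l1 = sum(abs(wj) for wj in w)
    return sum(1 for x, y in data if y * dot(w, x) - eps * l1 > 0) / len(data)

def run_simulation(seed=0, d=100, n=1000, sigma=2.0, eps=0.25, n_test=2000):
    rng = random.Random(seed)
    theta = [1.0] * d
    train = [sample(rng, theta, sigma) for _ in range(n)]
    test = [sample(rng, theta, sigma) for _ in range(n_test)]
    xs = [x for x, _ in train]
    ys = [y for _, y in train]

    w_sup = mean_signed(xs, ys)                 # all n labels used
    w_one = [ys[0] * v for v in xs[0]]          # a single labeled example
    unlabeled = xs[1:]                          # remaining labels discarded
    pseudo = [1 if dot(w_one, x) > 0 else -1 for x in unlabeled]
    w_ft = mean_signed(unlabeled, pseudo)       # pseudo-labeled estimator

    return {
        "supervised": robust_accuracy(w_sup, test, eps),
        "one_label": robust_accuracy(w_one, test, eps),
        "fixed_targets": robust_accuracy(w_ft, test, eps),
    }

results = run_simulation()
print(results)
```

With these settings, the estimator built from one label plus pseudo-labeled data should land close to the fully supervised one, while the single-example classifier on its own is noticeably less robust.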
Empirical Results:
- Evaluation on the CIFAR-10 and SVHN datasets shows that models trained with the proposed methods reach robust accuracies close to those of fully supervised adversarial training. On CIFAR-10, using unlabeled data improves robust accuracy by 21.7% over using 4K supervised examples alone and captures over 95% of the improvement obtained from the same number of labeled examples.
- Adding uncurated unlabeled data from the 80 Million Tiny Images dataset yields a 4% improvement over the previous state of the art on CIFAR-10 against the strongest known attack, which indicates that the approach also works when the unlabeled data is uncurated.
Label Noise Resilience:
- The analysis shows that UAT approaches, especially UAT-FT and UAT++ (which combines UAT-OT and UAT-FT), tolerate label noise well and retain much of their robustness even when many labels are wrong. This matters in practice, where pseudo-labels are often inaccurate.

Implications and Future Directions

The main implication is that labeled data may be less critical for adversarial robustness than previously assumed. This could make robust machine learning models more accessible in areas where data labeling is a bottleneck. Moreover, large uncurated unlabeled datasets offer a way to improve model robustness without heavy labeling costs.

Future avenues might focus on refining pseudo-labeling techniques and developing heuristic-based unlabeled data processing to further elevate robustness. The findings also indicate fertile ground for hybrid approaches combining self/semi-supervised learning methodologies with adversarial training to harness the full potential of unlabeled data in enhancing model robustness.

In conclusion, the paper supports, with both theory and experiments, the claim that unlabeled data can stand in for much of the labeled data needed by adversarial training, which makes robust models more practical to build in label-scarce environments.
