
Unlabeled Data Improves Adversarial Robustness (1905.13736v4)

Published 31 May 2019 in stat.ML, cs.CV, and cs.LG

Abstract: We demonstrate, theoretically and empirically, that adversarial robustness can significantly benefit from semisupervised learning. Theoretically, we revisit the simple Gaussian model of Schmidt et al. that shows a sample complexity gap between standard and robust classification. We prove that unlabeled data bridges this gap: a simple semisupervised learning procedure (self-training) achieves high robust accuracy using the same number of labels required for achieving high standard accuracy. Empirically, we augment CIFAR-10 with 500K unlabeled images sourced from 80 Million Tiny Images and use robust self-training to outperform state-of-the-art robust accuracies by over 5 points in (i) $\ell_\infty$ robustness against several strong attacks via adversarial training and (ii) certified $\ell_2$ and $\ell_\infty$ robustness via randomized smoothing. On SVHN, adding the dataset's own extra training set with the labels removed provides gains of 4 to 10 points, within 1 point of the gain from using the extra labels.

Authors (5)
  1. Yair Carmon (45 papers)
  2. Aditi Raghunathan (56 papers)
  3. Ludwig Schmidt (80 papers)
  4. Percy Liang (239 papers)
  5. John C. Duchi (50 papers)
Citations (712)

Summary

Unlabeled Data Improves Adversarial Robustness

The paper "Unlabeled Data Improves Adversarial Robustness" by Carmon et al. addresses the critical issue of improving the adversarial robustness of machine learning models using semi-supervised learning methods. The researchers propose that leveraging unlabeled data can significantly enhance the adversarial robustness of classifiers, both theoretically and empirically.

Theoretical Insights

The theoretical part of the paper revisits the Gaussian model of Schmidt et al. (2018), which demonstrates a sample complexity gap between standard and adversarially robust classification: attaining non-trivial adversarial robustness requires substantially more samples than achieving standard accuracy. This paper proves that unlabeled data can bridge the gap. Specifically, a simple semi-supervised procedure based on self-training achieves high robust accuracy using no more labels than standard (non-robust) classification requires, with unlabeled data making up the difference.

The authors provide a detailed theoretical framework showing that in a high-dimensional Gaussian setting, self-training with a combination of labeled and unlabeled data achieves high robust accuracy. They further show that the guarantee degrades gracefully when some of the unlabeled data is irrelevant: the required amount of unlabeled data grows as the inverse square of the relevant fraction. In practical terms, the majority of the unlabeled pool can be noisy or irrelevant, and the model can still be trained to be robust.
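
The self-training procedure analyzed in the Gaussian model can be sketched in a few lines: fit a linear classifier on a handful of labels, pseudo-label a large unlabeled pool with it, then refit on the pseudo-labels. The dimension, class separation, and sample sizes below are illustrative choices for the sketch, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters: dimension, noise level, class-mean norm, sample sizes.
d, sigma = 500, 1.0
mu = 3.0 * np.ones(d) / np.sqrt(d)          # class mean, ||mu|| = 3
n_labeled, n_unlabeled, n_test = 20, 5000, 2000

def sample(n):
    """Draw (x, y) with y uniform in {-1, +1} and x = y*mu + Gaussian noise."""
    y = rng.choice([-1.0, 1.0], size=n)
    x = y[:, None] * mu + sigma * rng.standard_normal((n, d))
    return x, y

x_l, y_l = sample(n_labeled)
x_u, _ = sample(n_unlabeled)                 # labels discarded: unlabeled pool
x_t, y_t = sample(n_test)

# Step 1: intermediate classifier from the few labels (sample-mean estimator).
w0 = (y_l[:, None] * x_l).mean(axis=0)

# Step 2: pseudo-label the unlabeled pool with the intermediate classifier.
pseudo = np.sign(x_u @ w0)

# Step 3: refit the same estimator on the pseudo-labeled data.
w1 = (pseudo[:, None] * x_u).mean(axis=0)

def accuracy(w, x, y):
    return (np.sign(x @ w) == y).mean()

print(f"labeled-only accuracy: {accuracy(w0, x_t, y_t):.3f}")
print(f"self-trained accuracy: {accuracy(w1, x_t, y_t):.3f}")
```

Even though many pseudo-labels are wrong, averaging over a large pool denoises the estimate of the class-mean direction, which is why the self-trained classifier improves on the labeled-only one in this setting.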

Empirical Results

Empirically, the paper presents robust self-training (RST) methods applied on standard image classification benchmarks like CIFAR-10 and SVHN, showing significant improvements in adversarial robustness. The authors augment CIFAR-10 with 500K unlabeled images from the 80 Million Tiny Images dataset and SVHN with its own unlabeled extra training set.

For heuristic $\ell_\infty$ robustness, they use adversarial training based on the TRADES framework, and for certified $\ell_2$ and $\ell_\infty$ robustness, they use randomized smoothing. On CIFAR-10, RST outperforms prior state-of-the-art models by over 5 percentage points in robust accuracy, reaching 62.5% under strong attacks. On SVHN, adding pseudo-labeled data improves robust accuracy by 4 to 10 percentage points, within about 1 point of the gain from using the true labels.
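
The robust self-training objective described above can be written schematically as a weighted sum of per-example losses over labeled and pseudo-labeled batches, each combining a standard classification term with a label-free robustness term. The function and argument names below are hypothetical, and `loss_rob` stands in for the TRADES consistency loss, whose inner adversarial maximization is omitted from this sketch.

```python
import numpy as np

def rst_loss(loss_std, loss_rob, labeled, pseudo_labeled, w=1.0, beta=6.0):
    """Schematic robust self-training (RST) objective, averaged over a batch.

    loss_std(x, y): standard classification loss on a clean example.
    loss_rob(x):    label-free robustness term, standing in for the TRADES
                    consistency loss max over perturbed x' of KL(p(x) || p(x')).
    w:              weight on pseudo-labeled examples relative to labeled ones.
    beta:           robustness/accuracy trade-off (TRADES commonly uses ~6).
    """
    terms = [loss_std(x, y) + beta * loss_rob(x) for x, y in labeled]
    terms += [w * (loss_std(x, yh) + beta * loss_rob(x))
              for x, yh in pseudo_labeled]
    return float(np.mean(terms))

# Toy usage with constant stand-in losses, just to show the arithmetic:
loss_std = lambda x, y: 1.0
loss_rob = lambda x: 0.5
value = rst_loss(loss_std, loss_rob, [(0, 0)], [(0, 0)], w=1.0, beta=6.0)
print(value)  # each term is 1.0 + 6.0 * 0.5 = 4.0, so the mean is 4.0
```

The key design point is that the robustness term needs no labels, so pseudo-labeled examples contribute to it exactly as labeled ones do; only the standard term depends on the (possibly noisy) pseudo-labels.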

Practical Implications and Future Directions

The research points to several significant advances for AI practice:

  1. Improved Adversarial Training: By incorporating unlabeled data, adversarial training can reach higher levels of robust accuracy, making models more reliable in real-world applications where adversaries might attempt to exploit vulnerabilities.
  2. Efficiency in Label Usage: Leveraging unlabeled data reduces dependence on large labeled datasets, which are expensive and time-consuming to assemble. This can lower the barrier to building robust models for teams without access to extensive high-quality labels.
  3. Robustness in Varied Conditions: The models' tolerance of irrelevant and noisy unlabeled data without significant drops in robust accuracy suggests the approach can work in diverse operational environments where data quality and relevance vary.

For future research, the implications of this work suggest exploring:

  • Scaling Methods to Larger Datasets: Assessing the effectiveness of robust self-training with larger unlabeled datasets and more complex model architectures.
  • Other Domains and Data Types: Evaluating these methods in domains beyond image classification, such as natural language processing or time-series data, to understand the generalizability of the approach.
  • Hybrid Training Approaches: Investigating combinations of different semi-supervised learning techniques with robust self-training to further enhance adversarial robustness.

Conclusion

The paper by Carmon et al. makes substantial contributions to understanding and improving adversarial robustness through semi-supervised learning. Its theoretical and empirical findings provide a solid foundation for leveraging unlabeled data to enhance the security and reliability of machine learning models, and they open new avenues both for practical AI systems and for further research into strengthening robustness through better use of available data.