Robust NAS under adversarial training: benchmark, theory, and beyond (2403.13134v1)
Abstract: Recent developments in neural architecture search (NAS) emphasize the importance of considering robust architectures against malicious data. However, there is a notable absence of benchmark evaluations and theoretical guarantees for searching these robust architectures, especially when adversarial training is considered. In this work, we address these two challenges and make two contributions. First, we release a comprehensive dataset that encompasses both the clean accuracy and the robust accuracy of a vast array of adversarially trained networks from the NAS-Bench-201 search space on image datasets. Second, leveraging the neural tangent kernel (NTK) tool from deep learning theory, we establish a generalization theory for searching architectures in terms of clean accuracy and robust accuracy under multi-objective adversarial training. We believe that our benchmark and theoretical insights will benefit the NAS community through reliable reproducibility, efficient assessment, and a theoretical foundation, particularly in the pursuit of robust architectures.
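To make the multi-objective adversarial training referred to above concrete, here is a minimal PyTorch sketch, assuming a standard l_inf PGD attack and a weighted sum of clean and robust cross-entropy losses; the toy model, the trade-off weight `lam`, and the PGD hyperparameters (`eps`, `alpha`, `steps`) are illustrative choices, not the paper's exact protocol.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Craft l_inf-bounded adversarial examples via projected gradient descent."""
    # Random start inside the eps-ball, clipped to the valid pixel range.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # Ascend the loss, then project back into the eps-ball around x.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def multi_objective_step(model, opt, x, y, lam=0.5):
    """One update on (1 - lam) * clean loss + lam * robust loss (lam is assumed)."""
    x_adv = pgd_attack(model, x, y)
    opt.zero_grad()
    loss = ((1 - lam) * F.cross_entropy(model(x), y)
            + lam * F.cross_entropy(model(x_adv), y))
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage on random data standing in for an image batch.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
print(multi_objective_step(model, opt, x, y))
```

Under this kind of objective, the released benchmark records, for each NAS-Bench-201 architecture, the clean accuracy (on unperturbed inputs) and the robust accuracy (under attack), which are the two quantities the generalization theory targets.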
- A convergence theory for deep learning via over-parameterization. In International Conference on Machine Learning (ICML), 2019.
- Square attack: a query-efficient black-box adversarial attack via random search. In European Conference on Computer Vision (ECCV), pp. 484–501. Springer, 2020.
- Harnessing the power of infinitely wide deep nets on small-data tasks. In International Conference on Learning Representations (ICLR), 2020.
- Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In International Conference on Machine Learning (ICML), 2018.
- Designing neural network architectures using reinforcement learning. In International Conference on Learning Representations (ICLR), 2017.
- JAHS-Bench-201: A foundation for research on joint architecture and hyperparameter search. In Advances in neural information processing systems (NeurIPS), 2022.
- On the inductive bias of neural tangent kernels. In Advances in neural information processing systems (NeurIPS), 2019.
- Language models are few-shot learners. Advances in neural information processing systems (NeurIPS), 2020.
- Generalization bounds of stochastic gradient descent for wide and deep neural networks. Advances in neural information processing systems (NeurIPS), 2019.
- Towards understanding the spectral bias of deep learning. arXiv preprint arXiv:1912.01198, 2019.
- Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pp. 39–57. IEEE, 2017.
- Neural architecture search on ImageNet in four GPU hours: A theoretically inspired perspective. In International Conference on Learning Representations (ICLR), 2021.
- On lazy training in differentiable programming. Advances in neural information processing systems (NeurIPS), 2019.
- The spectral bias of polynomial neural networks. In International Conference on Learning Representations (ICLR), 2022.
- A downsampled variant of ImageNet as an alternative to the CIFAR datasets. arXiv preprint arXiv:1707.08819, 2017.
- Parseval networks: Improving robustness to adversarial examples. In International Conference on Machine Learning (ICML), 2017.
- Minimally distorted adversarial examples with a fast adaptive boundary attack. In International Conference on Machine Learning (ICML), 2020a.
- Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In International Conference on Machine Learning (ICML), 2020b.
- Make some noise: Reliable and efficient single-step adversarial training. In Advances in neural information processing systems (NeurIPS), 2022.
- Scaling up your kernels to 31x31: Revisiting large kernel design in CNNs. In Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
- Xuanyi Dong and Yi Yang. NAS-Bench-201: Extending the scope of reproducible neural architecture search. In International Conference on Learning Representations (ICLR), 2020.
- An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations (ICLR), 2021.
- Gradient descent finds global minima of deep neural networks. In International Conference on Machine Learning (ICML), 2019a.
- Graph neural tangent kernel: Fusing graph neural networks with graph kernels. Advances in neural information processing systems (NeurIPS), 2019b.
- Gradient descent provably optimizes over-parameterized neural networks. In International Conference on Learning Representations (ICLR), 2019c.
- TransNAS-Bench-101: Improving transferability and generalizability of cross-task neural architecture search. In Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
- Convergence of adversarial training in overparametrized neural networks. In Advances in neural information processing systems (NeurIPS), 2019.
- MORA: Improving ensemble robustness evaluation with model reweighing attack. In Advances in neural information processing systems (NeurIPS), 2022.
- Matrix Computations (3rd ed.). Johns Hopkins University Press, 1996.
- Explaining and harnessing adversarial examples. In International Conference on Learning Representations (ICLR), 2015.
- When NAS meets robustness: In search of robust architectures against adversarial attacks. In Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
- Deep residual learning for image recognition. In Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
- Parametric noise injection: Trainable randomness to improve deep neural network robustness against adversarial attack. In Conference on Computer Vision and Pattern Recognition (CVPR), pp. 588–597, 2019.
- Benchmarking neural network robustness to common corruptions and perturbations. In International Conference on Learning Representations (ICLR), 2019.
- NAS-HPO-Bench-II: A benchmark dataset on joint optimization of convolutional neural network architecture and training hyperparameters. In Asian Conference on Machine Learning (ACML), 2021.
- DSRNA: Differentiable search of robust neural architectures. In Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6196–6205, 2021.
- Exploring architectural ingredients of adversarially robust deep neural networks. In Advances in neural information processing systems (NeurIPS), 2021.
- Why do deep residual networks generalize better than deep feedforward networks?—a neural tangent kernel perspective. Advances in neural information processing systems (NeurIPS), 2020.
- Revisiting residual networks for adversarial robustness: An architectural perspective. arXiv preprint arXiv:2212.11005, 2022.
- Neural tangent kernel: Convergence and generalization in neural networks. In Advances in neural information processing systems (NeurIPS), 2018.
- Neural architecture design and robustness: A dataset. In International Conference on Learning Representations (ICLR), 2023.
- NAS-Bench-NLP: Neural architecture search benchmark for natural language processing. IEEE Access, 10:45736–45747, 2022.
- Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
- DNR: A tunable robust pruning framework through dynamic network rewiring of DNNs. In Proceedings of the 26th Asia and South Pacific Design Automation Conference, pp. 344–350, 2021.
- Adversarial examples in the physical world. In Artificial intelligence safety and security, 2018.
- Wide neural networks of any depth evolve as linear models under gradient descent. In Advances in neural information processing systems (NeurIPS), 2019.
- Random search and reproducibility for neural architecture search. In Uncertainty in Artificial Intelligence, 2020.
- DARTS: Differentiable architecture search. In International Conference on Learning Representations (ICLR), 2019.
- Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations (ICLR), 2018.
- NAS-Bench-ASR: Reproducible neural architecture search for speech recognition. In International Conference on Learning Representations (ICLR), 2021. URL https://openreview.net/forum?id=CU0APx9LMaL.
- Neural architecture search without training. In International Conference on Machine Learning (ICML), 2021.
- AdvRush: Searching for adversarially robust neural architectures. In International Conference on Computer Vision (ICCV), 2021.
- Demystifying the neural tangent kernel from a practical perspective: Can it be trusted for neural architecture search without training? In Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
- Global convergence of deep networks with one wide layer followed by pyramidal topology. In Advances in neural information processing systems (NeurIPS), 2020.
- Generalization guarantees for neural architecture search with train-validation split. In International Conference on Machine Learning (ICML), pp. 8291–8301, 2021.
- Simple, fast, and flexible framework for matrix completion with infinite width neural networks. Proceedings of the National Academy of Sciences, 119(16):e2115064119, 2022.
- Large-scale evolution of image classifiers. In International Conference on Machine Learning (ICML), 2017.
- Regularized evolution for image classifier architecture search. In AAAI Conference on Artificial Intelligence, 2019.
- Overfitting in adversarially robust deep learning. In International Conference on Machine Learning (ICML), pp. 8093–8104, 2020.
- Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients. In AAAI Conference on Artificial Intelligence, 2018.
- NASI: Label- and data-agnostic neural architecture search at initialization. In International Conference on Learning Representations (ICLR), 2022.
- Super-convergence: Very fast training of neural networks using large learning rates. In Artificial intelligence and machine learning for multi-domain operations applications, volume 11006, pp. 369–386. SPIE, 2019.
- A genetic programming approach to designing convolutional neural network architectures. In Proceedings of the Genetic and Evolutionary Computation Conference, 2017.
- Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.
- Fourier features let networks learn high frequency functions in low dimensional domains. Advances in neural information processing systems (NeurIPS), 33:7537–7547, 2020.
- NAS-Bench-360: Benchmarking neural architecture search on diverse tasks. In Advances in neural information processing systems (NeurIPS), volume 35, pp. 12380–12394, 2022.
- Roman Vershynin. Introduction to the non-asymptotic analysis of random matrices, pp. 210–268. Cambridge University Press, 2012. doi: 10.1017/CBO9780511794308.006.
- Rethinking architecture selection in differentiable NAS. In International Conference on Learning Representations (ICLR), 2021.
- On the convergence of certified robust training with interval bound propagation. In International Conference on Learning Representations (ICLR), 2022.
- Exploring the loss landscape in neural architecture search. In Uncertainty in Artificial Intelligence, 2021.
- Neural architecture search: Insights from 1000 papers. arXiv preprint arXiv:2301.08727, 2023.
- Enhancing adversarial defense by k-winners-take-all. In International Conference on Learning Representations (ICLR), 2020.
- Feature denoising for improving adversarial robustness. In Conference on Computer Vision and Pattern Recognition (CVPR), pp. 501–509, 2019.
- KNAS: Green neural architecture search. In International Conference on Machine Learning (ICML), 2021.
- Feature squeezing: Detecting adversarial examples in deep neural networks. In Network and Distributed System Security Symposium (NDSS), 2018.
- Feature learning in infinite-width neural networks. In International Conference on Machine Learning (ICML), 2021.
- β-DARTS: Beta-decay regularization for differentiable architecture search. In Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
- NAS-Bench-101: Towards reproducible neural architecture search. In International Conference on Machine Learning (ICML), 2019.
- Understanding and robustifying differentiable architecture search. In International Conference on Learning Representations (ICLR), 2020.
- Surrogate NAS benchmarks: Going beyond the limited search spaces of tabular NAS benchmarks. In International Conference on Learning Representations (ICLR), 2022.
- Theoretically principled trade-off between robustness and accuracy. In International Conference on Machine Learning (ICML), 2019.
- Over-parameterized adversarial training: An analysis overcoming the curse of dimensionality. In Advances in neural information processing systems (NeurIPS), 2020.
- Generalization properties of NAS under activation and skip connection search. In Advances in neural information processing systems (NeurIPS), 2022a.
- Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization). In Advances in neural information processing systems (NeurIPS), 2022b.
- Neural architecture search with reinforcement learning. In International Conference on Learning Representations (ICLR), 2017.
- Learning transferable architectures for scalable image recognition. In Conference on Computer Vision and Pattern Recognition (CVPR), 2018.