Fragility, Robustness and Antifragility in Deep Learning (2312.09821v2)
Abstract: We propose a systematic analysis of deep neural networks (DNNs) based on a signal processing technique for network parameter removal, in the form of synaptic filters, that identifies the fragility, robustness and antifragility characteristics of DNN parameters. The proposed analysis investigates whether DNN performance is affected negatively, invariantly, or positively on both clean and adversarially perturbed test datasets when the DNN undergoes synaptic filtering. We define three \textit{filtering scores} that quantify the fragility, robustness and antifragility characteristics of DNN parameters based on the performance on (i) the clean dataset, (ii) the adversarial dataset, and (iii) the difference between clean and adversarial performance. We validate the proposed analysis on the ResNet-18, ResNet-50, SqueezeNet-v1.1 and ShuffleNet V2 x1.0 architectures with the MNIST, CIFAR-10 and Tiny ImageNet datasets. For a given network architecture, the filtering scores identify parameters whose characteristics are invariant across datasets over the learning epochs; conversely, for a given dataset, they identify parameters whose characteristics are invariant across architectures. We show that our synaptic filtering method improves the test accuracy of ResNet and ShuffleNet models on adversarial datasets when only the robust and antifragile parameters are selectively retrained at any given epoch, demonstrating an application of the proposed strategy to improving model robustness.
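
To make the procedure concrete, the sketch below gives one plausible reading of the abstract in PyTorch: a synaptic filter zeroes a subset of a layer's parameters, the change in clean and adversarial test accuracy is measured, and the filtered parameters are labelled fragile, robust or antifragile. This is a minimal sketch under stated assumptions, not the paper's exact definitions: the magnitude-based filter, the FGSM attack, the tolerance threshold and all function names are illustrative.

```python
# Minimal sketch (not the authors' reference implementation) of the analysis
# described in the abstract: remove a subset of parameters with a "synaptic
# filter", measure the resulting change in clean and adversarial accuracy, and
# label the filtered parameters as fragile, robust or antifragile.
# The magnitude-based filter, the FGSM attack, the tolerance threshold and the
# toy model below are illustrative assumptions.
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


def fgsm_perturb(model, x, y, eps=0.03):
    """Craft FGSM adversarial examples (Goodfellow et al., 2015)."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).detach()


def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()


def synaptic_filter(model, layer_name, fraction=0.1):
    """Copy the model and zero the smallest-magnitude weights of one layer
    (an assumed stand-in for the paper's filtering procedure)."""
    filtered = copy.deepcopy(model)
    w = dict(filtered.named_parameters())[layer_name]
    k = max(1, int(fraction * w.numel()))
    idx = w.abs().flatten().topk(k, largest=False).indices
    w.data.view(-1)[idx] = 0.0
    return filtered


def filtering_scores(model, layer_name, x, y, tol=0.01):
    """Return the three score ingredients named in the abstract -- change in
    (i) clean accuracy, (ii) adversarial accuracy, (iii) their difference --
    plus a characteristic label for the filtered parameters."""
    x_adv = fgsm_perturb(model, x, y)
    filtered = synaptic_filter(model, layer_name)
    d_clean = accuracy(filtered, x, y) - accuracy(model, x, y)
    d_adv = accuracy(filtered, x_adv, y) - accuracy(model, x_adv, y)
    d_gap = d_clean - d_adv
    if d_clean < -tol or d_adv < -tol:
        label = "fragile"        # removal degrades performance
    elif d_clean > tol or d_adv > tol:
        label = "antifragile"    # removal improves performance
    else:
        label = "robust"         # performance is invariant to removal
    return d_clean, d_adv, d_gap, label


if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy stand-ins for a trained network and an MNIST-sized test batch.
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64),
                          nn.ReLU(), nn.Linear(64, 10))
    x, y = torch.randn(128, 1, 28, 28), torch.randint(0, 10, (128,))
    print(filtering_scores(model, "1.weight", x, y))
```

In the paper the analysis is applied across layers, epochs, architectures and full test sets; this sketch only shows the bookkeeping for a single layer and a single batch, with the selective retraining of robust and antifragile parameters left out.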