Can Biases in ImageNet Models Explain Generalization? (2404.01509v1)
Abstract: The robust generalization of models to rare, in-distribution (ID) samples drawn from the long tail of the training distribution and to out-of-training-distribution (OOD) samples is one of the major challenges of current deep learning methods. For image classification, this manifests in the existence of adversarial attacks, performance drops on distorted images, and a lack of generalization to concepts such as sketches. The current understanding of generalization in neural networks is very limited, but some biases that differentiate models from human vision have been identified and may be causing these limitations. Consequently, several attempts, with varying success, have been made to reduce these biases during training in order to improve generalization. We take a step back and sanity-check these attempts. Fixing the architecture to the well-established ResNet-50, we perform a large-scale study of 48 ImageNet models obtained via different training methods to understand whether and how these biases, including shape bias, spectral biases, and critical bands, interact with generalization. Our extensive study reveals that, contrary to previous findings, these biases are insufficient to holistically and accurately predict a model's generalization. We provide access to all checkpoints and evaluation code at https://github.com/paulgavrikov/biases_vs_generalization
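To make the analysis the abstract describes concrete, here is a minimal sketch (not the authors' code; their evaluation code is at the linked repository) of the two-step recipe: measure a bias score per checkpoint, then rank-correlate it with a generalization metric across checkpoints. The shape-bias definition follows Geirhos et al. (2019): on cue-conflict images whose shape and texture labels disagree, shape bias is the fraction of shape decisions among all decisions matching either cue. All arrays and numbers below are hypothetical placeholders.

```python
# Sketch: does a per-model bias metric (e.g., shape bias) predict a
# per-model generalization metric (e.g., OOD accuracy)? Hypothetical data.
import numpy as np
from scipy.stats import spearmanr

def shape_bias(shape_correct: np.ndarray, texture_correct: np.ndarray) -> float:
    """Fraction of shape decisions among cue-conflict decisions that
    matched either the shape or the texture label (Geirhos et al., 2019)."""
    cue_matched = shape_correct | texture_correct
    return shape_correct.sum() / max(cue_matched.sum(), 1)

rng = np.random.default_rng(0)

# Hypothetical cue-conflict predictions for one model: booleans marking
# whether each prediction matched the shape label or the texture label.
shape_hits = rng.random(1280) < 0.3
texture_hits = (~shape_hits) & (rng.random(1280) < 0.5)
print(f"shape bias = {shape_bias(shape_hits, texture_hits):.3f}")

# Hypothetical measurements for 48 checkpoints (as in the paper's setup):
# a bias score and a generalization score per model.
bias_scores = rng.uniform(0.1, 0.9, size=48)    # e.g., shape bias per model
ood_accuracy = rng.uniform(0.2, 0.6, size=48)   # e.g., sketch/OOD accuracy

# Rank correlation across checkpoints: a strong, consistent correlation
# would suggest the bias predicts generalization; the paper finds the
# studied biases are insufficient for this.
rho, p = spearmanr(bias_scores, ood_accuracy)
print(f"Spearman rho = {rho:.3f}, p = {p:.3g}")
```

Spearman's rank correlation is used here (rather than Pearson's) because it only assumes a monotone relationship between bias and generalization, which is the weakest form the "bias explains generalization" claim could take.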