
Can Biases in ImageNet Models Explain Generalization? (2404.01509v1)

Published 1 Apr 2024 in cs.CV, cs.AI, cs.LG, and stat.ML

Abstract: The robust generalization of models to rare, in-distribution (ID) samples drawn from the long tail of the training distribution and to out-of-training-distribution (OOD) samples is one of the major challenges of current deep learning methods. For image classification, this manifests in the existence of adversarial attacks, the performance drops on distorted images, and a lack of generalization to concepts such as sketches. The current understanding of generalization in neural networks is very limited, but some biases that differentiate models from human vision have been identified and might be causing these limitations. Consequently, several attempts with varying success have been made to reduce these biases during training to improve generalization. We take a step back and sanity-check these attempts. Fixing the architecture to the well-established ResNet-50, we perform a large-scale study on 48 ImageNet models obtained via different training methods to understand how and if these biases - including shape bias, spectral biases, and critical bands - interact with generalization. Our extensive study results reveal that contrary to previous findings, these biases are insufficient to accurately predict the generalization of a model holistically. We provide access to all checkpoints and evaluation code at https://github.com/paulgavrikov/biases_vs_generalization


Summary

  • The paper finds that shape, spectral, and critical-band biases show only partial, nuanced correlations with in-distribution and out-of-distribution generalization and do not predict it holistically.
  • The paper fixes the architecture to ResNet-50 across 48 ImageNet models trained with varied methods, such as adversarial and contrastive training, to isolate the effect of biases from architectural differences.
  • The paper shows that shape-texture balance and critical-band bandwidth relate to different facets of robustness, informing strategies for model design.

Analyzing the Role of Biases in ImageNet Models and Their Influence on Generalization

The paper, "Can Biases in ImageNet Models Explain Generalization?" presents a comprehensive investigation into the hypothesis that biases inherent in ImageNet-trained neural networks affect their ability to generalize to unseen data. The paper focuses on biases such as shape bias, spectral bias, and critical band properties, which distinguish machine vision from human perception. The authors leverage a fixed ResNet-50 architecture across 48 ImageNet models trained using diverse methodologies to ascertain if these biases serve as predictors for robust model generalization.

Study Context and Motivation

The robust generalization of neural networks to rare, in-distribution (ID) samples or out-of-distribution (OOD) samples is a critical challenge. Despite advances in ImageNet classification, these models exhibit vulnerabilities to adversarial attacks and shifts in data distribution, such as differing weather conditions or digital artifacts. The paper scrutinizes various biases—previously identified differences between model and human vision—and evaluates their correlation with generalization capabilities.

Methodological Framework

The analysis fixes the architecture to ResNet-50 to neutralize confounding effects from differing architectural inductive biases. The models are trained with a variety of methodologies, including augmentation techniques, adversarial training, contrastive learning, and recent training recipes, so that the effect of the training method can be dissociated from architectural differences. To measure biases, the authors examine shape bias using the cue-conflict dataset, spectral biases via bandpass-filtered ImageNet samples, and critical-band properties by inserting noise into contrast-reduced and grayscale samples.
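As a concrete illustration of the first of these measurements, the sketch below computes a shape-bias score from cue-conflict predictions, defined as the fraction of shape decisions among all trials decided by either the shape or the texture cue. The arrays and their values are hypothetical placeholders for illustration, not the authors' released evaluation code.

```python
import numpy as np

def shape_bias(pred_classes, shape_labels, texture_labels):
    """Fraction of cue-conflict trials decided by shape among those
    decided by either shape or texture (all arrays are 1-D class indices)."""
    pred_classes = np.asarray(pred_classes)
    shape_hits = pred_classes == np.asarray(shape_labels)
    texture_hits = pred_classes == np.asarray(texture_labels)
    decided = shape_hits | texture_hits  # ignore trials matching neither cue
    if decided.sum() == 0:
        return float("nan")
    return shape_hits[decided].mean()

# Hypothetical usage: predictions from one model on a cue-conflict set.
preds    = [3, 7, 7, 1, 3]
shapes   = [3, 2, 7, 1, 5]   # class implied by the object's shape
textures = [4, 7, 2, 6, 3]   # class implied by the superimposed texture
print(f"shape bias: {shape_bias(preds, shapes, textures):.2f}")  # 0.60
```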

The paper also evaluates multiple facets of generalization: performance on in-distribution benchmarks, robustness benchmarks under distribution shift, conceptual changes probed with sketches and stylized images, and adversarial robustness.
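To make this evaluation setup concrete, here is a minimal sketch of how a fixed ResNet-50 checkpoint could be scored on ImageNet-style benchmark folders with torchvision; the checkpoint path, dataset directories, and preprocessing pipeline are assumptions for illustration and do not reproduce the paper's exact evaluation code.

```python
import torch
from torchvision import datasets, models, transforms

# Standard ImageNet preprocessing (an assumption; the paper's exact pipeline may differ).
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def top1_accuracy(model, data_dir, device="cuda", batch_size=128):
    """Top-1 accuracy on an ImageNet-style folder (e.g. the ID validation set or a sketch set)."""
    loader = torch.utils.data.DataLoader(
        datasets.ImageFolder(data_dir, transform=preprocess),
        batch_size=batch_size, num_workers=8)
    model.eval().to(device)
    correct = total = 0
    for images, labels in loader:
        logits = model(images.to(device))
        correct += (logits.argmax(dim=1).cpu() == labels).sum().item()
        total += labels.numel()
    return correct / total

# Hypothetical checkpoint and dataset paths.
model = models.resnet50()
model.load_state_dict(torch.load("resnet50_checkpoint.pth", map_location="cpu"))
for name, path in [("ID validation", "imagenet/val"), ("sketch", "imagenet-sketch")]:
    print(name, top1_accuracy(model, path))
```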

Key Findings and Insights

The paper reports several significant findings that challenge previous notions about these biases (a minimal sketch of the underlying correlation analysis follows the list):

  • Shape Bias: Shape bias is inversely related to ID performance; models with a stronger shape bias achieve lower ID accuracy. Moreover, in adversarially trained models, a balanced mix of shape and texture bias appears optimal for robust performance.
  • Spectral Bias: Low-frequency bias correlates only weakly with generalization, while high-frequency bias shows a surprising positive correlation with most aspects of generalization, with adversarial robustness being the exception.
  • Critical Band: The bandwidth of the critical band correlates with robustness, though the relationship is not directly causal: narrower bandwidths are associated with better non-adversarial robustness, while broader bandwidths correlate with improved adversarial robustness, contrary to previous studies.
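The sketch below shows the kind of rank-correlation analysis that underlies such statements, assuming per-model bias scores and accuracies are already available; the numbers are invented for illustration, not the paper's measurements.

```python
from scipy.stats import spearmanr

# Hypothetical per-checkpoint measurements (one entry per model in the study).
shape_bias   = [0.21, 0.35, 0.48, 0.62, 0.74]   # cue-conflict shape-bias fraction
id_accuracy  = [0.78, 0.77, 0.75, 0.72, 0.69]   # ImageNet validation top-1
ood_accuracy = [0.31, 0.34, 0.33, 0.36, 0.35]   # e.g. accuracy on sketches

for name, metric in [("ID", id_accuracy), ("OOD", ood_accuracy)]:
    rho, p = spearmanr(shape_bias, metric)
    print(f"shape bias vs. {name} accuracy: rho={rho:+.2f}, p={p:.3f}")
```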

These findings highlight the complexity of generalization, suggesting that no single bias acts as a reliable predictor of generalization across diverse conditions. The discrepancies in correlations with adversarial training and across different training methodologies underscore the nuanced interaction between training, architectural design, and performance.

Implications and Future Directions

The inquiry into biases offers foundational insights for the design of more robust AI models. It stresses the necessity of a holistic view that accounts for the multifaceted nature of generalization. As the authors advocate broadening the methodological scope, a promising avenue for further research is to devise training mechanisms that align model biases with human perception without compromising performance.

Future investigations could deepen understanding by exploring bias interactions and causal relationships in various architectures and implementing more comprehensive and adaptive benchmarks. The findings also emphasize the need for ongoing critical evaluations of models intended for safety-critical applications, where the cost of generalization failures can be substantial.

In conclusion, this paper adds valuable nuance to the discourse surrounding model biases and their implications, foregrounding the complexity of model generalization. It raises salient considerations for AI development and evaluation and prompts further reflection on aligning machine learning models with human cognitive processes.
