A data-centric approach to class-specific bias in image data augmentation (2403.04120v1)

Published 7 Mar 2024 in cs.CV

Abstract: Data augmentation (DA) enhances model generalization in computer vision but may introduce biases, impacting class accuracy unevenly. Our study extends this inquiry, examining DA's class-specific bias across various datasets, including those distinct from ImageNet, through random cropping. We evaluated this phenomenon with ResNet50, EfficientNetV2S, and SWIN ViT, discovering that while residual models showed similar bias effects, Vision Transformers exhibited greater robustness or altered dynamics. This suggests a nuanced approach to model selection, emphasizing bias mitigation. We also refined a "data augmentation robustness scouting" method to manage DA-induced biases more efficiently, reducing computational demands significantly (training 112 models instead of 1860; a reduction of factor 16.2) while still capturing essential bias trends.

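To make the evaluation loop described in the abstract concrete, the following Python sketch trains a model under several random-cropping intensities and reports per-class test accuracy, the quantity in which DA-induced class-specific bias shows up. It is an illustrative assumption, not the authors' code: PyTorch/torchvision, the Fashion-MNIST dataset, the tiny CNN, and the chosen crop scales all stand in for the paper's actual setup (ResNet50, EfficientNetV2S, and SWIN ViT across several datasets).

    # Hedged sketch (not the authors' code): per-class accuracy under a sweep of
    # random-crop intensities. Uneven accuracy drops across classes indicate
    # the class-specific bias the paper studies.
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    def per_class_accuracy(model, loader, num_classes):
        """Return per-class test accuracy as a tensor of length num_classes."""
        correct, total = torch.zeros(num_classes), torch.zeros(num_classes)
        model.eval()
        with torch.no_grad():
            for x, y in loader:
                preds = model(x).argmax(dim=1)
                for c in range(num_classes):
                    mask = y == c
                    total[c] += mask.sum()
                    correct[c] += (preds[mask] == c).sum()
        return correct / total.clamp(min=1)

    # Sweep the lower bound of the retained area in RandomResizedCrop;
    # 1.0 is effectively no cropping, smaller values mean more aggressive DA.
    for s in [1.0, 0.8, 0.6, 0.4]:
        train_tf = transforms.Compose([
            transforms.RandomResizedCrop(28, scale=(s, 1.0)),
            transforms.ToTensor(),
        ])
        train_set = datasets.FashionMNIST("data", train=True, download=True,
                                          transform=train_tf)
        test_set = datasets.FashionMNIST("data", train=False, download=True,
                                         transform=transforms.ToTensor())
        model = nn.Sequential(  # small stand-in; the paper uses much larger networks
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(), nn.Linear(32 * 7 * 7, 10),
        )
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.CrossEntropyLoss()
        model.train()
        for x, y in DataLoader(train_set, batch_size=256, shuffle=True):  # one epoch for brevity
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        acc = per_class_accuracy(model, DataLoader(test_set, batch_size=512), num_classes=10)
        print(f"crop scale >= {s}: per-class accuracy {[round(a, 3) for a in acc.tolist()]}")

In this framing, the paper's "data augmentation robustness scouting" corresponds to running such a sweep on a reduced grid of models and augmentation intensities, rather than exhaustively training every combination.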
Authors (2)
  1. Athanasios Angelakis
  2. Andrey Rass
