The Utility of Feature Reuse: Transfer Learning in Data-Starved Regimes
Abstract: Transfer learning with deep neural networks has become increasingly widespread for deploying well-tested computer vision systems in new domains, especially those with limited data. We describe a transfer learning use case in a data-starved regime with fewer than 100 labeled target samples. We evaluate the effectiveness of convolutional feature extraction and of fine-tuning overparameterized models as a function of target training set size, as well as their generalization to data with covariate shift, i.e., out-of-distribution (OOD) data. Our experiments demonstrate that both overparameterization and feature reuse contribute to the successful application of transfer learning for training image classifiers in data-starved regimes. We provide visual explanations to support our findings and conclude that transfer learning enhances the performance of CNN architectures in data-starved regimes.