Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey (2405.01725v1)
Abstract: Deep learning has made significant progress in computer vision, specifically in image classification, object detection, and semantic segmentation. The skip connection has played an essential role in the architecture of deep neural networks, enabling easier optimization through residual learning during training and improving accuracy at test time. Many neural networks have inherited the idea of residual learning with skip connections for various tasks, and it has become a standard choice for designing neural networks. This survey provides a comprehensive summary of, and outlook on, the development of skip connections in deep neural networks. The short history of skip connections is outlined, and the development of residual learning in deep neural networks is surveyed. The effectiveness of skip connections in the training and testing stages is summarized, and future directions for using skip connections in residual learning are discussed. Finally, we summarize seminal papers, source code, models, and datasets that utilize skip connections in computer vision, including image classification, object detection, semantic segmentation, and image reconstruction. We hope this survey will inspire researchers in the community to further develop skip connections in various forms and tasks, along with the theory of residual learning in deep neural networks. The project page can be found at https://github.com/apple1986/Residual_Learning_For_Images
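To make the core idea concrete, below is a minimal sketch of a residual block with a skip connection, written in PyTorch as an illustrative choice of framework. The class name `ResidualBlock` and the two-convolution residual branch are assumptions for illustration, not code from the survey or its project page; the point is only the structure y = F(x) + x, where the identity path lets the block learn just the residual F.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A minimal residual block: y = F(x) + x, in the spirit of ResNet."""

    def __init__(self, channels: int):
        super().__init__()
        # Two 3x3 convolutions form the residual branch F(x).
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = self.relu(self.bn1(self.conv1(x)))
        residual = self.bn2(self.conv2(residual))
        # The skip connection: add the input back before the final activation,
        # so the block only needs to learn the residual mapping F(x).
        return self.relu(residual + x)

# Usage: the block preserves the input shape, so blocks can be stacked deeply.
x = torch.randn(1, 64, 32, 32)
y = ResidualBlock(64)(x)
print(y.shape)  # torch.Size([1, 64, 32, 32])
```

During backpropagation, the identity path also gives gradients a direct route to earlier layers, which is one reason skip connections ease the optimization of very deep networks.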