Evolution of Image Segmentation using Deep Convolutional Neural Network: A Survey (2001.04074v3)

Published 13 Jan 2020 in cs.CV

Abstract: From autonomous car driving to medical diagnosis, the need for image segmentation is everywhere. Segmentation of an image is one of the indispensable tasks in computer vision. This task is comparatively more complicated than other vision tasks as it needs low-level spatial information. Basically, image segmentation can be of two types: semantic segmentation and instance segmentation. The combined version of these two basic tasks is known as panoptic segmentation. In the recent era, the success of deep convolutional neural networks (CNN) has influenced the field of segmentation greatly and has given us various successful models to date. In this survey, we are going to take a glance at the evolution of both semantic and instance segmentation work based on CNN. We have also specified comparative architectural details of some state-of-the-art models and discussed their training details to present a lucid understanding of hyper-parameter tuning of those models. We have also drawn a comparison among the performance of those models on different datasets. Lastly, we have given a glimpse of some state-of-the-art panoptic segmentation models.

Summary

The paper provides a comprehensive review of CNN-based image segmentation evolution, detailing semantic, instance, and panoptic approaches.
It evaluates advanced architectures like FCN, U-Net, and Mask R-CNN, emphasizing improvements in pixel accuracy and computational efficiency.
The study compares the training and optimization choices of these models on benchmark datasets and relates them to applications such as autonomous driving and medical diagnosis.

Insights on "Evolution of Image Segmentation using Deep Convolutional Neural Network: A Survey"

The paper "Evolution of Image Segmentation using Deep Convolutional Neural Network: A Survey" reviews the development of convolutional neural network (CNN)-based models for image segmentation. Image segmentation is a foundational and complex task in computer vision, pivotal for applications such as autonomous vehicles and medical diagnosis. The survey concentrates on semantic and instance segmentation, closes with a shorter look at the newer task of panoptic segmentation, and compares state-of-the-art models and their evolution over time.

Overview of Image Segmentation Types

The survey first divides image segmentation into two basic types, semantic segmentation and instance segmentation, and treats panoptic segmentation as their combination. Semantic segmentation labels each pixel of an image with a class, whereas instance segmentation also distinguishes between separate objects of the same class. Panoptic segmentation asks for both at once: every pixel receives a class label and, where it belongs to a countable object, an instance identity.
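The difference between the three label formats can be sketched on a toy example. The snippet below is only an illustration, not code from the survey: the one-row "image", the class ids (0 = road, 1 = car), and the function names are all hypothetical.

```python
# Toy 1x6 image row: two adjacent cars followed by road.
# Class ids are hypothetical: 0 = road, 1 = car.
WIDTH = 6
ROAD = 0
instances = [
    {"class_id": 1, "pixels": {0, 1}},  # first car
    {"class_id": 1, "pixels": {2, 3}},  # second car
]

def semantic_map(instances, width, background):
    """One class id per pixel; instance identity is discarded."""
    labels = [background] * width
    for inst in instances:
        for p in inst["pixels"]:
            labels[p] = inst["class_id"]
    return labels

def instance_map(instances, width):
    """One instance id per pixel (0 = no instance); ids start at 1."""
    ids = [0] * width
    for idx, inst in enumerate(instances, start=1):
        for p in inst["pixels"]:
            ids[p] = idx
    return ids

sem = semantic_map(instances, WIDTH, ROAD)
ins = instance_map(instances, WIDTH)
pan = list(zip(sem, ins))  # panoptic label: (class id, instance id) per pixel

print(sem)  # [1, 1, 1, 1, 0, 0]  -> the two cars are indistinguishable
print(ins)  # [1, 1, 2, 2, 0, 0]  -> the two cars are separated
print(pan)  # [(1, 1), (1, 1), (1, 2), (1, 2), (0, 0), (0, 0)]
```

In the semantic map the two cars merge into one region of class 1; only the instance and panoptic labels keep them apart.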

Evolution of Semantic Segmentation

The survey's discussion begins with semantic segmentation, where CNNs have achieved substantial success. It traces the evolution back to early models such as R-CNN, which applied CNN features to region proposals. The Fully Convolutional Network (FCN) marked a significant shift: by replacing the fully connected layers of a classification CNN with convolutional ones, it produces a dense, pixel-level prediction and can be trained end to end. FCN's successors, such as U-Net, SegNet, and DeepLab, added strategies like encoder-decoder architectures, dilated (atrous) convolutions, and pyramid pooling to address FCN's limitations in preserving spatial detail and capturing multi-scale context.
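Two of the ideas above can be illustrated in one dimension. The sketch below is a simplified assumption-laden example, not any model's actual layer: `conv1d` is a hypothetical helper, and the weights are made up. It shows (1) that a fully connected layer is a convolution whose kernel covers the whole input, so the same weights slide over a larger input to give a score map, and (2) that dilation widens the receptive field without adding weights.

```python
def conv1d(x, w, dilation=1):
    """Valid 1-D cross-correlation of x with kernel w, taps spaced by `dilation`."""
    span = (len(w) - 1) * dilation + 1  # receptive field of one output value
    return [
        sum(w[k] * x[i + k * dilation] for k in range(len(w)))
        for i in range(len(x) - span + 1)
    ]

w = [1, 0, -1]  # hypothetical learned weights

# 1) Input as long as the kernel: a single dot product, i.e. a fully connected layer.
fc_like = conv1d([3, 5, 4], w)

# 2) Same weights on a longer input: one score per position (a dense score map).
x = [3, 5, 4, 8, 1, 7, 2]
score_map = conv1d(x, w)

# 3) Dilation 2: still three weights, but each output now spans five inputs.
dilated = conv1d(x, w, dilation=2)

print(fc_like)    # [-1]
print(score_map)  # [-1, -3, 3, 1, -1]
print(dilated)    # [2, -2, 2]
```

Real segmentation networks do the same in two dimensions with many channels; the one-dimensional version only keeps the arithmetic easy to follow.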

Advancements in Instance Segmentation

Instance segmentation models mirror the progression seen in object detection. Starting from detection frameworks such as Fast R-CNN and Faster R-CNN, the task evolved by adding mask prediction to them. DeepMask generates class-agnostic mask proposals, while Mask R-CNN adds a mask-prediction branch to Faster R-CNN; both aim to improve pixel accuracy and segmentation efficiency. InstanceFCN introduced position-sensitive score maps, each responding to a relative position within an object (for example, its top-left part), so that adjacent instances of the same class can be told apart.
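In detector-based designs of this kind, a mask is typically predicted at a small fixed resolution per box and then placed back into the image. The following is a minimal sketch of that last step only, under simplifying assumptions: `paste_mask` is a hypothetical helper, the resize is nearest-neighbour (a real implementation would normally interpolate), and the 0.5 threshold and the 2x2 mask are illustrative values.

```python
def paste_mask(mask_probs, box, image_h, image_w, threshold=0.5):
    """Resize a small per-box mask to its box (nearest neighbour) and binarise it.

    box = (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    Returns an image-sized grid of 0/1 values.
    """
    x0, y0, x1, y1 = box
    box_h, box_w = y1 - y0, x1 - x0
    m_h, m_w = len(mask_probs), len(mask_probs[0])
    canvas = [[0] * image_w for _ in range(image_h)]
    for y in range(box_h):
        for x in range(box_w):
            src_y = y * m_h // box_h  # nearest source row in the small mask
            src_x = x * m_w // box_w  # nearest source column
            if mask_probs[src_y][src_x] >= threshold:
                canvas[y0 + y][x0 + x] = 1
    return canvas

# Hypothetical 2x2 mask probabilities for one detected box.
mask = [[0.9, 0.2],
        [0.8, 0.7]]
full = paste_mask(mask, box=(1, 0, 5, 4), image_h=4, image_w=6)
for row in full:
    print(row)
# [0, 1, 1, 0, 0, 0]
# [0, 1, 1, 0, 0, 0]
# [0, 1, 1, 1, 1, 0]
# [0, 1, 1, 1, 1, 0]
```

Because each box gets its own mask, two overlapping objects of the same class still produce two separate binary masks, which is what distinguishes this output from a semantic map.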

Emergence of Panoptic Segmentation

The survey recognizes panoptic segmentation as a merging of previously discrete tasks. Panoptic segmentation models aim to simultaneously achieve the objectives of both semantic and instance segmentation. Recent approaches, such as UPSNet and OANet, extend existing segmentation models by integrating components for semantic and instance segmentation into a unified architecture, reflecting an emerging trend towards holistic scene understanding models.
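To make the combination concrete, the sketch below merges a semantic prediction and a set of instance masks with a simple score-ordered heuristic. This is not the mechanism of UPSNet or OANet, which learn how to resolve conflicts; it is only an assumed baseline for illustration, and the function name, class ids, scores, and the `VOID` label are hypothetical.

```python
VOID = -1  # hypothetical label for pixels that no prediction explains

def merge_panoptic(semantic, instance_masks, stuff_classes):
    """Combine per-pixel class ids with instance masks into panoptic labels.

    semantic:       list of class ids, one per pixel
    instance_masks: list of (class_id, score, set_of_pixel_indices)
    stuff_classes:  class ids of uncountable regions (road, sky, ...)
    Returns one (class_id, instance_id) pair per pixel; instance_id 0 means "no instance".
    """
    out = [None] * len(semantic)
    next_id = 1
    # Higher-scoring instances claim contested pixels first.
    for class_id, _score, pixels in sorted(instance_masks, key=lambda m: -m[1]):
        free = [p for p in pixels if out[p] is None]
        if not free:
            continue
        for p in free:
            out[p] = (class_id, next_id)
        next_id += 1
    # Remaining pixels take their stuff class from the semantic prediction.
    for p, class_id in enumerate(semantic):
        if out[p] is None:
            out[p] = (class_id, 0) if class_id in stuff_classes else (VOID, 0)
    return out

# Hypothetical ids: 0 = road (stuff), 1 = car (thing). The two car masks overlap at pixel 1.
semantic = [1, 1, 1, 0, 0]
instance_masks = [(1, 0.9, {0, 1}), (1, 0.6, {1, 2})]
panoptic = merge_panoptic(semantic, instance_masks, stuff_classes={0})
print(panoptic)  # [(1, 1), (1, 1), (1, 2), (0, 0), (0, 0)]
```

The contested pixel goes to the higher-scoring car, so every pixel ends up with exactly one label, which is the defining constraint of the panoptic format.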

Training Approaches and Comparative Analysis

The survey compares the optimization techniques and hyperparameters used across models, including learning rates, batch sizes, optimizer choices, and data augmentation strategies. These comparisons help clarify the tuning needed to reach state-of-the-art performance on benchmark datasets such as PASCAL VOC and MS COCO. Through tables and brief comparisons, the survey also reports the benchmark performance of notable models, emphasizing advances in accuracy and computational efficiency.
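Semantic segmentation results on benchmarks such as PASCAL VOC are usually reported as mean intersection-over-union (mIoU). As a rough sketch of how that number is obtained, assuming flattened label lists and skipping classes that appear in neither prediction nor ground truth (the function name and toy labels are hypothetical):

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Toy flattened label maps with three classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]

# Per-class IoU: class 0 -> 1/3, class 1 -> 2/3, class 2 -> 1/2.
miou = mean_iou(pred, target, num_classes=3)
print(round(miou, 3))  # 0.5
```

Benchmark toolkits normally accumulate intersections and unions over the whole dataset before dividing, rather than averaging per image, so this per-sample version is only meant to show the arithmetic.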

Implications and Future Directions

The survey's detailed taxonomy and analysis underscore key trends in image segmentation and highlight areas for further exploration. One of the notable implications is the continuous move towards more integrated systems, evident in the development of panoptic segmentation techniques. Additionally, the demand for real-time segmentation in applications like autonomous driving prompts research into lightweight and faster model variants. Future work will likely explore more efficient architectures and training strategies that balance predictive accuracy and resource constraints.

Conclusion

In conclusion, the survey by Sultana, Sufian, and Dutta provides a critical assessment of CNN-driven progress in image segmentation, offering a roadmap of the technological advancements that have shaped this domain. By cataloging significant models and their contributions, the survey not only outlines a historical trajectory but also suggests prospective avenues of research and development in this rapidly evolving field.
