
A Survey on Instance Segmentation: State of the art (2007.00047v1)

Published 28 Jun 2020 in cs.CV, cs.LG, and eess.IV

Abstract: Object detection or localization is an incremental step in progression from coarse to fine digital image inference. It not only provides the classes of the image objects, but also provides the location of the image objects which have been classified. The location is given in the form of bounding boxes or centroids. Semantic segmentation gives fine inference by predicting labels for every pixel in the input image. Each pixel is labelled according to the object class within which it is enclosed. Furthering this evolution, instance segmentation gives different labels for separate instances of objects belonging to the same class. Hence, instance segmentation may be defined as the technique of simultaneously solving the problem of object detection as well as that of semantic segmentation. In this survey paper on instance segmentation -- its background, issues, techniques, evolution, popular datasets, related work up to the state of the art and future scope have been discussed. The paper provides valuable information for those who want to do research in the field of instance segmentation.

Citations (369)

Summary

  • The paper provides a taxonomy of instance segmentation techniques that combine object detection and segmentation for precise instance labeling.
  • It details methodologies like Mask R-CNN and dense sliding window approaches, highlighting advances in accuracy and computational challenges.
  • The survey emphasizes diverse datasets and calls for enhanced model efficiency to better address issues like small object detection and occlusions.

An Overview of Instance Segmentation: State of the Art

The paper under review is a survey of instance segmentation, a subfield of computer vision. It covers the evolution, methodologies, datasets, and open challenges of the field, and traces the progression from object classification and localization to semantic segmentation and, finally, instance segmentation.

Instance segmentation distinguishes itself by offering distinct labels for separate instances of objects belonging to the same class, effectively combining the tasks of object detection and semantic segmentation. The authors catalog a variety of instance segmentation techniques and introduce a taxonomy of methods along with a timeline of significant advancements in the field.
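To make the distinction concrete, the following is a minimal sketch on a toy image, not code from the survey. It assumes an instance map stored as a nested list of integer ids (0 for background); the names `instance_map`, `instance_class`, `to_semantic`, and `to_boxes` are hypothetical. Collapsing the ids gives a semantic segmentation, and taking the extent of each id gives detection-style bounding boxes, which illustrates why instance segmentation can be viewed as solving both problems at once.

```python
# Toy 4x6 image: two objects of the same class on a background of zeros.
# All names and values are illustrative assumptions, not taken from the survey.
instance_map = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]
instance_class = {1: "cat", 2: "cat"}  # both instances share one class


def to_semantic(instance_map, instance_class):
    """Collapse instance ids into per-pixel class names (semantic segmentation)."""
    return [
        [instance_class.get(i, "background") for i in row]
        for row in instance_map
    ]


def to_boxes(instance_map):
    """Return one (x_min, y_min, x_max, y_max) box per instance id (detection)."""
    boxes = {}
    for y, row in enumerate(instance_map):
        for x, i in enumerate(row):
            if i == 0:
                continue  # skip background pixels
            x0, y0, x1, y1 = boxes.get(i, (x, y, x, y))
            boxes[i] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return boxes


semantic = to_semantic(instance_map, instance_class)
boxes = to_boxes(instance_map)
```

In the semantic map both objects carry the same label "cat", so the two cats can no longer be told apart; the instance map keeps them separate through the ids 1 and 2.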

Instance Segmentation Techniques

The survey delineates multiple instance segmentation techniques, including:

  1. Classification of Mask Proposals: This traditional approach generates mask proposals and then classifies them. The R-CNN family is central to this category, illustrating the transition from region proposals generated by selective search to more advanced CNN-based architectures.
  2. Detection Followed by Segmentation: Techniques like Mask R-CNN define this popular approach, where object detection is followed by mask refinement. Mask R-CNN notably extends Faster R-CNN by adding a parallel branch for mask prediction, significantly enhancing segmentation accuracy.
  3. Labelling Pixels Followed by Clustering: This method adapts semantic segmentation networks to assign categories at the pixel level and then clusters them into distinct instances. This approach frequently struggles with computational intensity and real-time applicability.
  4. Dense Sliding Window Methods: This more recent approach predicts a mask at each position of a dense sliding window over the image. TensorMask exemplifies it by representing the masks as structured four-dimensional tensors.
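The third category (labelling pixels, then clustering) can be sketched in a few lines. The example below is a simplified stand-in, not a method from the survey: it assumes the semantic network has already produced a binary foreground mask, and it uses 4-connected component labelling as the clustering step. The function name `cluster_instances` is hypothetical.

```python
from collections import deque


def cluster_instances(semantic_mask):
    """Group foreground pixels (value 1) into instances.

    Uses breadth-first search over 4-connected neighbours as a simple
    stand-in for the clustering step. Returns (label_map, instance_count),
    where label_map holds 0 for background and 1..count for instances.
    """
    h, w = len(semantic_mask), len(semantic_mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if semantic_mask[y][x] != 1 or labels[y][x] != 0:
                continue  # background, or already assigned to an instance
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < h and 0 <= nx < w
                    if inside and semantic_mask[ny][nx] == 1 and labels[ny][nx] == 0:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
labels, count = cluster_instances(mask)
```

A limitation of this sketch is that two touching objects of the same class merge into one component, which is one reason practical methods in this category need a more discriminative clustering step.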

Datasets

The survey also emphasizes critical datasets such as the Microsoft COCO and Cityscapes datasets, which provide large-scale, diverse annotated images crucial for training and benchmarking instance segmentation models. These datasets facilitate improvements in accuracy and robustness by offering varied real-world scenarios.
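Benchmarking on such datasets rests on comparing predicted masks with annotated ones. The summary does not name a metric; as an assumption, the sketch below shows mask intersection-over-union (IoU), the overlap measure underlying COCO-style mask average precision. The function name `mask_iou` and the toy masks are hypothetical.

```python
def mask_iou(pred, truth):
    """Intersection-over-union of two binary masks given as nested lists.

    Returns 0.0 when both masks are empty, to avoid dividing by zero.
    """
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else 0.0


pred = [
    [1, 1, 0],
    [1, 1, 0],
]
truth = [
    [0, 1, 1],
    [0, 1, 1],
]
iou = mask_iou(pred, truth)  # 2 shared pixels out of 6 covered pixels
```

A prediction is typically counted as correct only when its IoU with a ground-truth mask exceeds a chosen threshold, so small misalignments matter far more for small objects than for large ones.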

Challenges and Future Directions

Despite advancements, instance segmentation remains computationally demanding, especially for real-time applications. The authors highlight challenges such as the detection of small objects, handling occlusions, geometric transformations, and varying object scales. Furthermore, they point to ongoing developments in model efficiency and the need for more adaptable, autonomous fine-tuning of neural network architectures.

The survey suggests that future work on instance segmentation will likely emphasize reducing computational complexity and improving model adaptability to real-world conditions. Techniques such as non-local neural networks, together with architectures such as GCNet and PANet, are presented as playing an important role in addressing these issues.

Conclusion

This survey serves as a critical resource for researchers in the domain of computer vision, offering an in-depth examination of instance segmentation. By outlining significant methodologies, datasets, and the evolution of techniques, it sets the stage for ongoing research and development aimed at improving the efficiency and applicability of instance segmentation in real-world applications.