High Quality Segmentation for Ultra High-resolution Images (2111.14482v3)

Published 29 Nov 2021 in cs.CV

Abstract: Segmenting 4K or 6K ultra high-resolution images requires extra consideration of computation cost. Common strategies, such as down-sampling, patch cropping, and cascade models, cannot adequately balance accuracy against computation cost. Motivated by the fact that humans distinguish among objects continuously from coarse to precise levels, we propose the Continuous Refinement Model (CRM) for the ultra high-resolution segmentation refinement task. CRM continuously aligns the feature map with the refinement target and aggregates features to reconstruct these images' details. In addition, CRM shows significant generalization ability, filling the resolution gap between low-resolution training images and ultra high-resolution testing ones. We present quantitative performance evaluation and visualization to show that our proposed method is fast and effective on image segmentation refinement. Code will be released at https://github.com/dvlab-research/Entity.

Summary

The paper presents the Continuous Refinement Model (CRM) that refines segmentation of ultra high-resolution images through continuous alignment of feature maps.
The model employs an implicit function and multi-resolution inference to boost IoU by 4.01% and mBA by 15.12% while processing images over twice as fast as previous methods.
The approach balances precision and computational cost, offering promising applications in areas like medical imaging and industrial defect detection.

High Quality Segmentation for Ultra High-resolution Images

The paper introduces a novel approach to the segmentation refinement of ultra high-resolution (UHR) images, presenting the Continuous Refinement Model (CRM). This approach aims to effectively balance accuracy and computational cost, a notable challenge when dealing with images at 4K and 6K resolutions.

Overview

Traditional methods such as down-sampling and patch cropping often fail to maintain a balance between detailed accuracy and computational efficiency in UHR image segmentation. CRM addresses these limitations by emulating the human process of object recognition, moving from coarse to fine details. The model utilizes a Continuous Alignment Module (CAM) that ensures feature maps are continuously aligned with refinement targets.

CRM improves on existing techniques by processing images efficiently without a cascade-based decoder. Instead, it uses an implicit function to establish a continuous local representation of the features. This representation is what lets CRM bridge the resolution gap between lower-resolution training images and ultra high-resolution test images.
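To make the idea of a continuous local representation concrete, here is a minimal sketch of querying a feature map at an arbitrary coordinate. It is not the paper's implementation: the names `query_feature` and `implicit_predict` are hypothetical, and a single linear layer with a sigmoid stands in for whatever learned implicit function CRM actually uses. The point is only that the prediction depends on the nearest feature and the query's offset from it, so the same function can be evaluated at any output resolution.

```python
import math


def query_feature(feature_map, x, y):
    """Return the nearest feature vector and the query's offset from its cell centre.

    feature_map: list of rows, each row a list of feature vectors (lists of floats).
    x, y: continuous coordinates in [0, 1), independent of the map's resolution.
    """
    height, width = len(feature_map), len(feature_map[0])
    # Map the continuous coordinate to the cell that contains it.
    col = min(int(x * width), width - 1)
    row = min(int(y * height), height - 1)
    # Cell centres sit at (index + 0.5) / size in normalised coordinates.
    dx = x - (col + 0.5) / width
    dy = y - (row + 0.5) / height
    return feature_map[row][col], (dx, dy)


def implicit_predict(feature_map, x, y, weights, bias):
    """Toy stand-in for a learned implicit function (assumption: linear + sigmoid)."""
    feature, (dx, dy) = query_feature(feature_map, x, y)
    inputs = list(feature) + [dx, dy]
    logit = sum(w * v for w, v in zip(weights, inputs)) + bias
    # Probability that the queried position belongs to the foreground.
    return 1.0 / (1.0 + math.exp(-logit))
```

Because `x` and `y` are normalised, the same 2x2 feature map could be sampled on a 4x4 grid or a 4000x4000 grid without changing the function, which is the property the summary attributes to the continuous representation.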

Key Contributions

Continuous Refinement Model (CRM): Using the CAM, CRM continuously aligns feature maps with the refinement target, which sets it apart from cascade-based approaches that often carry heavy computational costs.
Implicit Function Utilization: CRM uses a pixel-wise implicit function to model the relationship between a pixel's position and its prediction, so that features can be queried at any position rather than only on a fixed grid.
Multi-resolution Inference: At inference time, CRM refines the segmentation progressively from low to high resolution, adding detail at each step. This strategy underlies its strong generalization from low-resolution training images to ultra high-resolution test images.
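The coarse-to-fine inference loop described in the last item can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: `refine_step` is a hypothetical callback standing in for the refinement network, the list of scales is chosen by the caller, and nearest-neighbour resizing is used only to keep the example dependency-free.

```python
def resize_nearest(mask, new_h, new_w):
    """Resize a 2D mask (list of rows) with nearest-neighbour sampling."""
    old_h, old_w = len(mask), len(mask[0])
    return [
        [mask[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def multi_resolution_refine(coarse_mask, target_size, scales, refine_step):
    """Refine a coarse mask at increasing scales, ending at the target size.

    refine_step(mask, scale) is a hypothetical stand-in for the refinement
    model; it must return a mask of the same size as its input.
    """
    target_h, target_w = target_size
    mask = coarse_mask
    for scale in sorted(scales):
        h = max(1, round(target_h * scale))
        w = max(1, round(target_w * scale))
        # Carry the previous result up to the current working resolution.
        mask = resize_nearest(mask, h, w)
        mask = refine_step(mask, scale)
    # Guarantee the output matches the target even if 1.0 was not in scales.
    return resize_nearest(mask, target_h, target_w)
```

Each pass starts from the previous pass's output, so errors fixed at a low resolution do not have to be fixed again at a higher one.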

Performance and Results

On the BIG dataset, CRM achieves higher Intersection over Union (IoU) and mean Boundary Accuracy (mBA) than state-of-the-art methods. On average, it improves IoU by 4.01% and mBA by 15.12% over baseline methods. CRM also produces its results more than twice as fast as previous top-performing methods.
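For readers unfamiliar with the first metric, IoU for a binary mask is the number of pixels that are foreground in both prediction and ground truth divided by the number that are foreground in either. A minimal sketch follows; the function name `iou` and the convention of returning 1.0 when both masks are empty are assumptions made for this example, and mBA is not covered here because its exact definition is not given in this summary.

```python
def iou(pred, target):
    """Intersection over Union for two equally sized binary masks (lists of rows)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            intersection += p and t  # foreground in both masks
            union += p or t          # foreground in at least one mask
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0
```

For example, a prediction `[[1, 1], [0, 0]]` against a ground truth `[[1, 0], [1, 0]]` shares one foreground pixel out of three that are foreground in either mask, giving an IoU of one third.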

Implications and Future Prospects

CRM’s ability to efficiently handle UHR images holds significant implications for fields such as medical imaging and industrial defect detection, where detail and accuracy are paramount. The successful integration of implicit functions suggests a potential avenue for further research in leveraging continuous feature maps across other computer vision tasks.

The paper indicates that the challenges surrounding UHR training remain, highlighting the substantial resource requirements for training CRM with ultra high-resolution datasets. Addressing these training limitations presents an opportunity for future research.

Conclusion

CRM advances UHR image segmentation refinement by combining continuous feature alignment with an implicit function. Through continuous refinement and a multi-resolution inference strategy, it achieves both high precision and computational efficiency, and offers a reference point for future segmentation refinement methods.

GitHub

GitHub - qqlu/Entity: EntitySeg Toolbox: Towards Open-World and High-Quality Image Segmentation (676 stars)