Reinforced active learning for image segmentation (2002.06583v1)

Published 16 Feb 2020 in cs.CV

Abstract: Learning-based approaches for semantic segmentation have two inherent challenges. First, acquiring pixel-wise labels is expensive and time-consuming. Second, realistic segmentation datasets are highly unbalanced: some categories are much more abundant than others, biasing the performance to the most represented ones. In this paper, we are interested in focusing human labelling effort on a small subset of a larger pool of data, minimizing this effort while maximizing performance of a segmentation model on a hold-out set. We present a new active learning strategy for semantic segmentation based on deep reinforcement learning (RL). An agent learns a policy to select a subset of small informative image regions -- opposed to entire images -- to be labeled, from a pool of unlabeled data. The region selection decision is made based on predictions and uncertainties of the segmentation model being trained. Our method proposes a new modification of the deep Q-network (DQN) formulation for active learning, adapting it to the large-scale nature of semantic segmentation problems. We test the proof of concept in CamVid and provide results in the large-scale dataset Cityscapes. On Cityscapes, our deep RL region-based DQN approach requires roughly 30% less additional labeled data than our most competitive baseline to reach the same performance. Moreover, we find that our method asks for more labels of under-represented categories compared to the baselines, improving their performance and helping to mitigate class imbalance.

Authors (4)

Arantxa Casanova
Pedro O. Pinheiro
Negar Rostamzadeh
Christopher J. Pal

Citations (100)


Summary

Reinforced Active Learning for Image Segmentation

This paper addresses two challenges in semantic segmentation: the cost of pixel-wise labeling and the class imbalance of realistic datasets. The authors introduce an active learning strategy that uses deep reinforcement learning (RL) to select small image regions for annotation, rather than entire images. Labeling at the region level is intended to spend human annotation effort only on the most informative parts of the unlabeled pool.

Methodology

The proposed method modifies the deep Q-network (DQN) formulation to suit the large-scale nature of semantic segmentation problems. Active learning is framed as a Markov decision process (MDP) in which an RL agent learns to prioritize regions by their expected contribution to segmentation performance, measured using Intersection over Union (IoU). The agent's decisions are based on the predictions and uncertainties of the segmentation model being trained, using features that describe class distributions and entropy-based uncertainty.
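To make the entropy-based uncertainty feature concrete, the following is a minimal sketch, assuming softmax outputs of shape `(C, H, W)` and square, non-overlapping regions. The function names `region_entropy` and `top_k_regions` and the ranking step are illustrative assumptions, not the authors' implementation. Ranking regions purely by entropy resembles an entropy baseline; in the paper the agent learns its own selection policy, and scores like these are only inputs to it.

```python
import numpy as np

def region_entropy(probs, region_size):
    """Mean pixel-wise entropy per square region.

    probs: array of shape (C, H, W) holding per-pixel class probabilities.
    Assumes H and W are divisible by region_size (an assumption of this sketch).
    """
    eps = 1e-12  # avoids log(0) for fully confident pixels
    pixel_entropy = -(probs * np.log(probs + eps)).sum(axis=0)  # shape (H, W)
    h, w = pixel_entropy.shape
    r = region_size
    # Split the entropy map into (H/r, W/r) blocks of r x r pixels.
    blocks = pixel_entropy.reshape(h // r, r, w // r, r)
    return blocks.mean(axis=(1, 3))

def top_k_regions(scores, k):
    """Return (row, col) indices of the k highest-scoring regions."""
    flat = np.argsort(scores, axis=None)[::-1][:k]
    return [tuple(int(i) for i in np.unravel_index(idx, scores.shape)) for idx in flat]
```

For example, calling `top_k_regions(region_entropy(probs, 2), 1)` on a 4x4 prediction map returns the grid position of the single most uncertain 2x2 region.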

The authors define the states, actions, and rewards of the MDP explicitly, and stress that the choice of state and action representations strongly affects the quality of the agent's decisions. The segmentation network f is retrained iteratively as newly labeled regions arrive, and the agent receives rewards derived from the resulting improvements in segmentation performance.
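The reward signal and the value update can be sketched as follows. This is a hedged illustration, not the paper's exact formulation: it assumes the reward is the change in mean IoU after retraining on the newly labeled regions, and it uses the standard one-step DQN target. The names `reward`, `dqn_target`, and the discount `gamma` are assumptions introduced here.

```python
import numpy as np

def reward(miou_before, miou_after):
    """Assumed reward: improvement in mean IoU after one labeling step."""
    return miou_after - miou_before

def dqn_target(r, next_q_values, gamma=0.99, terminal=False):
    """Standard one-step DQN target: r + gamma * max_a' Q(s', a').

    next_q_values: Q-values of candidate actions in the next state,
    typically produced by a separate target network.
    """
    if terminal:
        return r  # no bootstrapping once the labeling budget is exhausted
    return r + gamma * float(np.max(next_q_values))
```

Under these assumptions, a step that raises mean IoU from 0.50 to 0.55 yields a reward of about 0.05, and the Q-network is regressed toward that reward plus the discounted best value of the next state.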

Experimental Results

The authors test the proof of concept on CamVid and provide large-scale results on Cityscapes, reporting efficiency improvements over established baselines, including BALD, entropy-based selection, and uniform sampling. On Cityscapes, the RL-based approach requires roughly 30% less additional labeled data than the most competitive baseline to reach the same performance. The proposed batch-mode selection chooses several regions per step, which further improves computational efficiency because those regions can be processed in parallel.

The numerical results substantiate the effectiveness of the RL strategy in targeting under-represented classes, potentially reducing class bias inherent in traditional segmentation datasets. The paper demonstrates improved performance particularly on challenging categories such as 'pedestrian' and 'bicycle', highlighting the method's ability to address class imbalance.
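Per-class IoU is what makes gains on under-represented categories visible, since a single mean can hide weak classes. The following is a minimal sketch of the standard computation from a confusion matrix. The helper names are illustrative and the snippet is not taken from the paper's code.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    idx = num_classes * y_true.ravel() + y_pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def per_class_iou(cm):
    """IoU_c = TP_c / (TP_c + FP_c + FN_c); NaN for classes that never occur."""
    tp = np.diag(cm).astype(float)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    return np.where(union > 0, tp / np.maximum(union, 1), np.nan)

# Toy example with two classes and four pixels.
y_true = np.array([0, 0, 0, 1])
y_pred = np.array([0, 0, 1, 1])
ious = per_class_iou(confusion_matrix(y_true, y_pred, 2))
mean_iou = float(np.nanmean(ious))  # averages only over classes that occur
```

In the toy example, class 0 has an IoU of 2/3 and class 1 has an IoU of 1/2, so comparing the two entries shows which class lags even though the mean alone would not.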

Implications

The implications of this work are both practical and theoretical. Practically, it reduces the human labeling cost of semantic segmentation, making the process more scalable and viable in real-world applications. This is particularly significant in domains like autonomous driving and biomedical image analysis, where precise semantic segmentation is crucial.

Theoretically, this research adds depth to active learning methodologies by demonstrating the applicability of RL in dynamic sampling contexts. This opens opportunities for further exploration into RL frameworks in other high-dimensional tasks and datasets, potentially driving advancements across various AI domains.

Looking ahead, one can anticipate developments in domain adaptation to enhance the transferability of the policies learned from one dataset to another. Additionally, optimizing the definition of image regions could refine the effectiveness of this approach further, contributing to finer-grained segmentation models.

In conclusion, this paper presents a robust exploration of reinforced active learning within semantic segmentation, offering a promising direction for both practical applications and theoretical research in machine learning.
