
Weakly-Supervised Salient Object Detection via Scribble Annotations (2003.07685v1)

Published 17 Mar 2020 in cs.CV

Abstract: Compared with laborious pixel-wise dense labeling, it is much easier to label data by scribbles, which only costs 1-2 seconds to label one image. However, using scribble labels to learn salient object detection has not been explored. In this paper, we propose a weakly-supervised salient object detection model to learn saliency from such annotations. In doing so, we first relabel an existing large-scale salient object detection dataset with scribbles, namely S-DUTS dataset. Since object structure and detail information is not identified by scribbles, directly training with scribble labels will lead to saliency maps of poor boundary localization. To mitigate this problem, we propose an auxiliary edge detection task to localize object edges explicitly, and a gated structure-aware loss to place constraints on the scope of structure to be recovered. Moreover, we design a scribble boosting scheme to iteratively consolidate our scribble annotations, which are then employed as supervision to learn high-quality saliency maps. As existing saliency evaluation metrics neglect to measure structure alignment of the predictions, the saliency map ranking metric may not comply with human perception. We present a new metric, termed saliency structure measure, to measure the structure alignment of the predicted saliency maps, which is more consistent with human perception. Extensive experiments on six benchmark datasets demonstrate that our method not only outperforms existing weakly-supervised/unsupervised methods, but also is on par with several fully-supervised state-of-the-art models. Our code and data are publicly available at https://github.com/JingZhang617/Scribble_Saliency.

Citations (234)

Summary

  • The paper introduces a novel framework that uses scribble annotations to reduce labeling time while training effective salient object detection models.
  • An auxiliary edge detection network and gated structure-aware loss are incorporated to enhance boundary precision in the resulting saliency maps.
  • Experimental results on six benchmark datasets show that the method outperforms existing weakly-supervised approaches and rivals fully-supervised models.

Weakly-Supervised Salient Object Detection via Scribble Annotations

The paper "Weakly-Supervised Salient Object Detection via Scribble Annotations" explores a novel approach to salient object detection (SOD) using weak supervision in the form of scribble annotations. While traditional SOD methods rely heavily on labor-intensive pixel-wise annotations, this paper leverages the efficiency of scribbles, which can be created in only 1-2 seconds per image.

Key Contributions

  1. Scribble-Annotated Dataset: The authors present the S-DUTS dataset, a relabeled version of the existing DUTS dataset, annotated with scribbles. This dataset facilitates the training of weakly-supervised SOD models without the need for dense annotations.
  2. Auxiliary Edge Detection Task: Given the challenge that scribble annotations do not capture object boundaries well, the paper proposes an auxiliary edge detection network. This network aids the model in localizing object edges more effectively, improving boundary accuracy in saliency maps.
  3. Gated Structure-Aware Loss: A gated structure-aware loss encourages the structure of the predicted saliency maps to align with image edges, while the gate restricts the scope of structure to be recovered, keeping the network's attention on the salient regions.
  4. Scribble Boosting Scheme: A novel iterative strategy called the scribble boosting scheme is presented, which helps refine and consolidate the scribble annotations into more comprehensive supervisory signals for training.
  5. Saliency Structure Measure: The authors propose a new evaluation metric, the saliency structure measure ($B_\mu$), specifically designed to assess the alignment of predicted saliency maps with the human perception of structural consistency.
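
To make the gated structure-aware loss (item 3) concrete, the following is a minimal NumPy sketch of an edge-aware smoothness penalty with a gate, not the authors' implementation: the `gate` mask, the `alpha` edge-weighting constant, and all function names here are illustrative assumptions.

```python
import numpy as np

def gated_structure_aware_loss(saliency, image_gray, gate, alpha=10.0):
    """Illustrative gated, edge-aware smoothness loss (not the paper's code).

    Penalizes gradients in the predicted saliency map, except where the
    image itself has a strong edge (the exp(-alpha * |dI|) weight vanishes
    there), and only inside a binary gate mask that limits the scope of
    structure to be recovered.
    """
    loss = 0.0
    for axis in (0, 1):  # vertical, then horizontal finite differences
        ds = np.abs(np.diff(saliency, axis=axis))    # saliency gradient magnitude
        di = np.abs(np.diff(image_gray, axis=axis))  # image gradient magnitude
        # Crop the gate so its shape matches the differenced arrays.
        g = gate[1:, :] if axis == 0 else gate[:, 1:]
        loss += np.mean(ds * np.exp(-alpha * di) * g)
    return loss
```

A perfectly flat saliency map incurs zero loss, while saliency transitions that do not coincide with image edges are penalized wherever the gate is active.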

Experimental Validation

The authors conducted extensive experiments on six benchmark datasets, demonstrating that their method surpasses existing weakly-supervised and unsupervised SOD methods in both qualitative and quantitative evaluations, and rivals some state-of-the-art fully-supervised models. Notably, the proposed method achieved favorable results across metrics such as Mean Absolute Error, F-measure, E-measure, and the newly introduced $B_\mu$.
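
Two of the standard metrics mentioned above are straightforward to compute; the sketch below uses the conventional definitions (MAE as the mean absolute difference, F-measure with the customary $\beta^2 = 0.3$ precision weighting), not the paper's evaluation code, and the threshold of 0.5 is a simplifying assumption (benchmarks typically sweep thresholds).

```python
import numpy as np

def mae(pred, gt):
    # Mean Absolute Error between a predicted saliency map and the
    # ground truth, both assumed to be arrays of values in [0, 1].
    return np.mean(np.abs(pred - gt))

def f_measure(pred, gt, thresh=0.5, beta2=0.3):
    # F-measure at a single fixed threshold, weighting precision by
    # beta^2 = 0.3 as is conventional in the SOD literature.
    binary = pred >= thresh
    positives = gt > 0.5
    tp = np.logical_and(binary, positives).sum()
    precision = tp / max(binary.sum(), 1)
    recall = tp / max(positives.sum(), 1)
    denom = beta2 * precision + recall
    return (1 + beta2) * precision * recall / denom if denom > 0 else 0.0
```

A prediction identical to the ground truth yields MAE of 0 and F-measure of 1; E-measure and the proposed $B_\mu$ are more involved and are omitted here.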

Implications and Future Directions

The proposed framework significantly reduces the time and effort required for data annotation in salient object detection by utilizing scribble annotations. The integration of an edge detection task and the development of a structure-aware loss represent critical innovations for boundary-aware SOD models. Further enhancements in scribble annotation techniques or alternative forms of weak supervision can propel this framework to broader applications in computer vision tasks where precise annotation is a bottleneck.

Given the promising results of this approach, future research might investigate the extension of scribble-based weak supervision to more complex multi-object scenes and finer details in object segmentation. Another avenue for exploration could be the automated generation of scribble annotations to further reduce human effort, possibly leveraging semi-supervised learning techniques.

This work marks a significant step toward more computationally efficient and less resource-intensive methods for salient object detection, opening up possibilities for application in dynamic environments and rapidly changing fields where labeled data are scarce or evolving.