
EGNet: Edge Guidance Network for Salient Object Detection (1908.08297v1)

Published 22 Aug 2019 in cs.CV

Abstract: Fully convolutional neural networks (FCNs) have shown their advantages in the salient object detection task. However, most existing FCNs-based methods still suffer from coarse object boundaries. In this paper, to solve this problem, we focus on the complementarity between salient edge information and salient object information. Accordingly, we present an edge guidance network (EGNet) for salient object detection with three steps to simultaneously model these two kinds of complementary information in a single network. In the first step, we extract the salient object features in a progressive fusion manner. In the second step, we integrate the local edge information and global location information to obtain the salient edge features. Finally, to sufficiently leverage these complementary features, we couple the same salient edge features with salient object features at various resolutions. Benefiting from the rich edge information and location information in salient edge features, the fused features can help locate salient objects, especially their boundaries, more accurately. Experimental results demonstrate that the proposed method performs favorably against the state-of-the-art methods on six widely used datasets without any pre-processing and post-processing. The source code is available at http://mmcheng.net/egnet/.

Citations (855)

Summary

  • The paper proposes a novel network that explicitly models both salient object and edge features to preserve precise boundaries.
  • It achieves superior performance by jointly optimizing edge and object detection tasks, resulting in a 1.9% mean F-measure improvement over competitors.
  • The architecture employs specialized modules for multi-resolution feature extraction and effective feature fusion, setting a new standard for SOD.

EGNet: Edge Guidance Network for Salient Object Detection

EGNet targets a persistent weakness of Salient Object Detection (SOD): Fully Convolutional Networks (FCNs) often produce coarse object boundaries. To address this, the paper introduces an Edge Guidance Network (EGNet) that exploits the complementarity between salient edge information and salient object information within a unified neural architecture.

Key Contributions

This paper makes three primary contributions to the domain:

  1. Explicit Modeling of Edge and Object Information: The authors propose an architecture that explicitly models both salient object information and salient edge information in order to preserve object boundaries. This dual modeling helps in achieving more accurate SOD outcomes.
  2. Joint Optimization of Complementary Tasks: By jointly optimizing the tasks of edge detection and object detection, the network allows these complementary features to mutually benefit each other, improving the quality of the predicted saliency maps (a loss of this form is sketched after this list).
  3. Comprehensive Evaluation and Superior Performance: EGNet is compared against 15 state-of-the-art methods across six established datasets. The proposed method consistently achieves the best performance across three evaluation metrics without requiring any pre-processing or post-processing steps.
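
To make the joint-optimization idea concrete, here is a minimal sketch of a combined loss: per-scale cross-entropy on the saliency side outputs plus cross-entropy on the predicted edge map. The function name, the `edge_weight` parameter, and the deep-supervision-at-every-scale setup are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def joint_loss(sal_preds, edge_pred, sal_gt, edge_gt, edge_weight=1.0):
    """Combined saliency + edge loss (sketch, not the released EGNet code).

    sal_preds -- list of saliency logits at several resolutions
    edge_pred -- salient edge logits
    sal_gt    -- binary saliency ground truth, shape (N, 1, H, W)
    edge_gt   -- binary edge ground truth, shape (N, 1, H, W)
    """
    loss = 0.0
    for pred in sal_preds:
        # Match the ground truth to each side output's resolution
        # (deep supervision at every scale is our assumption here).
        gt = F.interpolate(sal_gt, size=pred.shape[-2:], mode="nearest")
        loss = loss + F.binary_cross_entropy_with_logits(pred, gt)
    edge_gt = F.interpolate(edge_gt, size=edge_pred.shape[-2:], mode="nearest")
    return loss + edge_weight * F.binary_cross_entropy_with_logits(edge_pred, edge_gt)
```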

Methodology

Architecture

EGNet employs a modified VGG backbone coupled with new modules for feature extraction:

  • Salient Object Feature Extraction: A Progressive Salient Object Feature Extraction Module (PSFEM) captures multi-resolution salient object features in a U-Net-like top-down fashion, ensuring the collection of rich contextual information.
  • Salient Edge Feature Extraction: In parallel, a Non-Local Salient Edge Features Extraction Module (NLSEM) integrates local edge information from lower layers with global location information propagated top-down, retaining both edge detail and spatial coherence (see the sketch after this list).
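
The sketch below illustrates the general pattern behind these two modules, assuming (for simplicity, our assumption) that all backbone levels have already been projected to a common channel count. `ProgressiveFusion` refines object features top-down, U-Net style; `EdgeFeatureFusion` combines a shallow, high-resolution feature map with upsampled location features from the deepest level. It is a schematic of the idea, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProgressiveFusion(nn.Module):
    """Top-down progressive fusion of multi-level features (PSFEM-style sketch)."""

    def __init__(self, channels, num_levels):
        super().__init__()
        self.refine = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=1) for _ in range(num_levels)]
        )

    def forward(self, feats):
        # feats: backbone features ordered shallow (high-res) to deep (low-res),
        # all with the same channel count (a simplifying assumption).
        out = [None] * len(feats)
        out[-1] = F.relu(self.refine[-1](feats[-1]))
        for i in range(len(feats) - 2, -1, -1):
            up = F.interpolate(out[i + 1], size=feats[i].shape[-2:],
                               mode="bilinear", align_corners=False)
            out[i] = F.relu(self.refine[i](feats[i] + up))  # fuse deeper context
        return out

class EdgeFeatureFusion(nn.Module):
    """Fuse shallow edge detail with deep location cues (NLSEM-style sketch)."""

    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, shallow_feat, deep_feat):
        # Upsample global location features to the shallow (edge) resolution.
        loc = F.interpolate(deep_feat, size=shallow_feat.shape[-2:],
                            mode="bilinear", align_corners=False)
        return F.relu(self.conv(shallow_feat + loc))
```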

Complementary Information Fusion

To effectively merge the extracted features, the One-to-One Guidance Module (O2OGM) is employed. This module directly couples salient edge features with multi-resolution object features, leveraging the complementary nature of these features to enhance both segmentation and localization accuracy. This approach contrasts with conventional progressive fusion methods and provides superior results by maintaining the distinct contributions of edge information throughout the layers.
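
A minimal sketch of one-to-one guidance follows, under the same simplifying assumptions as above (shared channel count; fusion by addition followed by a convolution, which is our assumption rather than a claim about the released code). The key point is that the same edge features are resized to and coupled with the object features at every scale, rather than being passed through a progressive chain; each fused scale produces its own saliency prediction, which can then feed the joint loss sketched earlier.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OneToOneGuidance(nn.Module):
    """Couple one edge feature map with object features at every scale
    (O2OGM-style sketch)."""

    def __init__(self, channels, num_scales):
        super().__init__()
        self.fuse = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=1) for _ in range(num_scales)]
        )
        self.predict = nn.ModuleList(
            [nn.Conv2d(channels, 1, 1) for _ in range(num_scales)]
        )

    def forward(self, edge_feat, obj_feats):
        preds = []
        for fuse, predict, obj in zip(self.fuse, self.predict, obj_feats):
            # The SAME edge features guide every scale (one-to-one, not chained).
            e = F.interpolate(edge_feat, size=obj.shape[-2:],
                              mode="bilinear", align_corners=False)
            preds.append(predict(F.relu(fuse(obj + e))))  # per-scale saliency logits
        return preds
```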

Experimental Results

The comprehensive evaluation on public benchmarks (ECSSD, PASCAL-S, DUT-OMRON, SOD, HKU-IS, DUTS) confirms EGNet's superior performance. For example, EGNet achieves a mean F-measure improvement of 1.9% over the next best approach (PiCANet) across these datasets. It also shows considerable reductions in Mean Absolute Error (MAE) and gains in S-measure, particularly on challenging datasets such as SOD and DUTS.
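
For reference, the two headline metrics are straightforward to compute; β² = 0.3 is the conventional weighting in the SOD literature. The snippet below is a generic implementation with a fixed binarization threshold (benchmarks often use adaptive or swept thresholds instead); it is not the authors' evaluation code.

```python
import numpy as np

def f_measure(pred, gt, beta_sq=0.3, threshold=0.5):
    """F-measure between a saliency map `pred` in [0, 1] and a binary mask `gt`.

    beta_sq = 0.3 is the standard SOD setting; the fixed threshold is a
    simplification (benchmarks often sweep or adapt it).
    """
    binarized = pred >= threshold
    tp = np.logical_and(binarized, gt > 0).sum()
    precision = tp / max(binarized.sum(), 1)
    recall = tp / max((gt > 0).sum(), 1)
    denom = beta_sq * precision + recall
    return (1 + beta_sq) * precision * recall / denom if denom > 0 else 0.0

def mae(pred, gt):
    """Mean Absolute Error between a saliency map and its ground truth."""
    return np.abs(pred.astype(np.float64) - gt.astype(np.float64)).mean()
```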

Implications

The results of this research indicate potential improvements for applications relying on accurate object detection within images, such as autonomous driving, medical imaging diagnostics, and augmented reality. The dual focus on boundary precision and region saliency provides a more holistic and accurate detection paradigm, pushing the boundaries of what is achievable within the SOD task.

Future Directions

Future research can explore the adaptability of EGNet to other backbone architectures, potentially enhancing its performance further. Additionally, the application of similar edge-guided methodologies could be extended to other computer vision tasks that necessitate precise boundary delineation, such as semantic segmentation and object instance segmentation.

By leveraging the complementary aspects of edge and object information, EGNet sets a new standard in salient object detection, laying groundwork for more nuanced and accurate vision systems in various practical applications.