Dense Extreme Inception Network: Towards a Robust CNN Model for Edge Detection (1909.01955v2)

Published 4 Sep 2019 in cs.CV and cs.LG

Abstract: This paper proposes a Deep Learning based edge detector, which is inspired by both HED (Holistically-Nested Edge Detection) and Xception networks. The proposed approach generates thin edge-maps that are plausible for human eyes; it can be used in any edge detection task without previous training or fine tuning process. As a second contribution, a large dataset with carefully annotated edges has been generated. This dataset has been used for training the proposed approach as well as the state-of-the-art algorithms for comparisons. Quantitative and qualitative evaluations have been performed on different benchmarks showing improvements with the proposed method when F-measure of ODS and OIS are considered.

Summary

The paper introduces DexiNed, a novel CNN architecture that uses dense inception modules to achieve precise edge detection.
It integrates multi-scale feature learning with a specialized upsampling block to enhance edge continuity and accuracy.
Evaluations on the BIPED and other datasets show that DexiNed outperforms state-of-the-art methods with improved F-measure scores.

Dense Extreme Inception Network: A Robust CNN Model for Edge Detection

The paper "Dense Extreme Inception Network: Towards a Robust CNN Model for Edge Detection" by Xavier Soria, Edgar Riba, and Angel Sappa introduces a deep learning approach to edge detection in images. Drawing inspiration from Holistically-Nested Edge Detection (HED) and Xception networks, the authors propose a Convolutional Neural Network (CNN) architecture, the Dense Extreme Inception Network (DexiNed), designed to produce thin, high-fidelity edge maps suitable for computer vision applications. According to the authors, the trained model can be applied to other edge detection tasks without further training or fine-tuning on the target dataset.

Key Contributions

The paper makes several notable contributions to the field of computer vision:

DexiNed Architecture: The proposed CNN architecture includes dense connections facilitating multi-scale feature learning and a specialized upsampling block to refine edge predictions. The architecture eliminates the necessity for pre-trained weights, allowing it to be trained from scratch.
Edge Detection Dataset: The authors introduce and release the Barcelona Images for Perceptual Edge Detection (BIPED), a dataset with 250 high-resolution images meticulously annotated for edge detection tasks. BIPED serves both as a training ground and evaluation benchmark for comparing various edge detection algorithms, given its comprehensive annotation quality.

Methodology and Architecture

DexiNed's architecture departs from typical CNNs by implementing an inception-based approach with dense interconnections, which are beneficial for retaining edge features across different scales. It comprises two main sub-networks: the Dense Extreme Inception network (Dexi) and an upsampling block. The Dexi consists of six main processing blocks, each yielding feature maps that interact with the upsampling block to generate intermediate and final edge maps.
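To make the idea of several intermediate edge maps feeding one final map concrete, here is a minimal pure-Python sketch. It assumes each block's output has already been upsampled to a common H x W grid of logits, and it fuses them with a weighted average followed by a sigmoid. The function name `fuse_side_outputs`, the uniform default weights, and the averaging rule itself are illustrative assumptions; the summary above does not specify how DexiNed combines its intermediate maps.

```python
import math


def sigmoid(x):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_side_outputs(side_logits, weights=None):
    """Fuse per-block logit maps (all H x W) into one edge-probability map.

    Assumption: a weighted average of logits followed by a sigmoid.
    This is an illustrative fusion rule, not the paper's confirmed one.
    """
    n = len(side_logits)
    if weights is None:
        weights = [1.0 / n] * n  # hypothetical default: uniform weights
    height, width = len(side_logits[0]), len(side_logits[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for w, logits in zip(weights, side_logits):
        for r in range(height):
            for c in range(width):
                fused[r][c] += w * logits[r][c]
    return [[sigmoid(v) for v in row] for row in fused]


# Six hypothetical 1 x 2 logit maps, one per processing block.
side_maps = [[[float(k), float(-k)]] for k in range(6)]
edge_map = fuse_side_outputs(side_maps)
```

In this toy input the first pixel receives positive logits from most blocks and ends up above 0.5, while the second ends up below it. A learned fusion would replace the fixed weights with trained parameters.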

The upsampling block combines standard convolutional layers with transposed convolutional (deconvolution) layers, both with learned weights, to incrementally upscale feature maps to the resolution of the input image. This design helps DexiNed preserve thin, precise edges, which matter in applications where delineation accuracy is paramount.
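The incremental upscaling can be illustrated with the standard output-size formula for a transposed convolution along one axis (dilation 1): out = (in - 1) * stride - 2 * padding + kernel + output_padding. The sketch below applies it repeatedly until a feature map reaches the input size. The kernel size 4, stride 2, padding 1, the 45-pixel starting size, and the downsampling factor of 16 are assumptions chosen so that each step exactly doubles the size; they are not taken from the paper.

```python
def transposed_conv_out(size, kernel, stride, padding=0, output_padding=0):
    """Output length of a transposed convolution along one axis (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def upsample_to_input(size, factor, kernel=4, stride=2, padding=1):
    """Apply stride-2 transposed convolutions until the map is `factor` times larger.

    Returns the final size and the number of upsampling steps used.
    The default kernel/stride/padding are hypothetical values that
    double the size at every step.
    """
    steps = 0
    scale = 1
    while scale < factor:
        size = transposed_conv_out(size, kernel, stride, padding)
        scale *= stride
        steps += 1
    return size, steps


# A hypothetical 45-pixel-high map, 16x smaller than a 720-pixel-high image.
final_size, n_steps = upsample_to_input(45, 16)
```

With these settings the size goes 45, 90, 180, 360, 720 in four steps. Deeper feature maps need more steps than shallow ones, which is why the upscaling is described as incremental.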

Evaluation and Results

The model's performance was assessed across multiple datasets, both edge-focused and those aimed at boundary detection. DexiNed demonstrated superior performance in edge detection tasks, notably achieving higher F-measure scores on the BIPED dataset compared to state-of-the-art techniques, including HED, RCF, and BDCN. Evaluation metrics such as ODS, OIS, and AP consistently showed enhancement over these leading models, showcasing DexiNed's efficacy in generating coherent and visually plausible edge-maps.
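The two F-measure variants differ only in how the binarization threshold is chosen. ODS (optimal dataset scale) uses the single threshold that gives the best F-measure over the whole dataset, while OIS (optimal image scale) lets each image use its own best threshold. The sketch below assumes the per-image, per-threshold counts of true positives, false positives, and false negatives are already available. Real benchmarks obtain them by matching predicted and annotated edge pixels within a small distance tolerance, which is omitted here, and the `counts` layout is a hypothetical one chosen for illustration.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def ods(counts):
    """Best F-measure with one threshold shared by every image.

    counts[i][t] = (tp, fp, fn) for image i at threshold index t.
    """
    best = 0.0
    for t in range(len(counts[0])):
        tp = sum(img[t][0] for img in counts)
        fp = sum(img[t][1] for img in counts)
        fn = sum(img[t][2] for img in counts)
        best = max(best, f_measure(tp, fp, fn))
    return best


def ois(counts):
    """F-measure when each image picks its own best threshold."""
    tp = fp = fn = 0
    for img in counts:
        best_tp, best_fp, best_fn = max(img, key=lambda c: f_measure(*c))
        tp += best_tp
        fp += best_fp
        fn += best_fn
    return f_measure(tp, fp, fn)


# Two images, two thresholds each; the numbers are made up.
counts = [
    [(8, 2, 2), (5, 0, 5)],
    [(4, 6, 1), (4, 1, 1)],
]
ods_score = ods(counts)  # 0.72
ois_score = ois(counts)  # 0.80
```

In this toy example the images prefer different thresholds, so OIS is higher than ODS.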

Interestingly, the model, trained solely on BIPED, also showed competitive performance on datasets not originally intended for edge detection, like BSDS and NYUD. This result points to DexiNed’s potential adaptability and generalization capability in diverse image domains.

Implications and Future Directions

The practical implications of DexiNed lie in its robustness and versatility for edge detection in varied applications, spanning medical imaging to remote sensing. Its architecture, by fostering seamless integration of multi-scale features and avoiding dependency on pre-trained weights, serves as a reference point for developing future advanced models in computer vision.

Moving forward, the line of research could explore the adaptation of DexiNed's architecture to related tasks such as object boundary detection or semantic segmentation. Additionally, enhancing the interpretability and efficiency of the model could further advance its applicability, particularly in resource-limited environments.

In summary, this research marks significant progress in leveraging deep learning for edge detection, offering a comprehensive methodology backed by thorough empirical evaluation and a well-constructed dataset for the community.
