Dynamic Feature Fusion for Semantic Edge Detection
Semantic Edge Detection (SED) has emerged as a vital task in computer vision: detecting edges in images and assigning semantic labels to those boundaries. The paper "Dynamic Feature Fusion for Semantic Edge Detection" introduces a novel approach that enhances SED by dynamically adapting the fusion weights for multi-scale features, addressing a limitation of the fixed-weight fusion strategies used in existing models such as CASENet, SEAL, and DDS.
The authors propose Dynamic Feature Fusion (DFF), which learns adaptive fusion weights tailored to each image's content and to the local context at every position. This contrasts sharply with traditional methods, which apply the same fusion weights to every image and every pixel, regardless of image content or the semantic context of each location.
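To make the contrast concrete, here is a minimal tensor-level sketch in PyTorch. All shapes and variable names are illustrative, not taken from the paper's released code; the point is only where the fusion weights come from.

```python
import torch

B, K, C, H, W = 1, 4, 19, 64, 64          # batch, levels, classes, height, width
responses = torch.randn(B, K, C, H, W)     # multi-level edge responses

# Fixed fusion: one weight per (level, class), shared by every pixel of
# every image -- akin to the fixed-weight fusion in CASENet-style models.
w_fixed = torch.randn(K, C)
fused_fixed = (responses * w_fixed.view(1, K, C, 1, 1)).sum(dim=1)

# Dynamic fusion (the DFF idea): weights vary per image *and* per location,
# so each pixel can emphasize the feature level suited to its local context.
w_dynamic = torch.randn(B, K, C, H, W)     # predicted by a weight learner
fused_dynamic = (responses * w_dynamic).sum(dim=1)

print(fused_fixed.shape, fused_dynamic.shape)  # both (1, 19, 64, 64)
```

The weighted sum over levels is identical in both cases; the two strategies differ only in whether the weights are universal constants or functions of the input.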
Key Contributions
- Dynamic Fusion Strategy: The paper presents a dynamic feature fusion strategy that learns fusion weights adaptively for each image location. This is achieved through a location-adaptive weight learner that adjusts the fusion weights based on the content of the feature maps, significantly improving the accuracy and sharpness of edge predictions (see the fusion-head sketch after this list).
- Normalizer Module: A feature extractor with a normalizer scales the multi-level responses to a similar magnitude. This scaling removes the bias toward higher-level activation maps and ensures that low-level features can contribute effectively to detecting fine edge details (also shown in the sketch after this list).
- Improved Performance: Comprehensive experiments on the Cityscapes and SBD benchmarks demonstrate that the DFF model consistently outperforms state-of-the-art models. The reported MF scores show that dynamic feature fusion localizes object boundaries more precisely, with marked improvements over CASENet, DDS, and SEAL.
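To show how these pieces fit together, the sketch below combines the normalizer and the location-adaptive weight learner in one PyTorch module. The class name `DFFHead`, the choice of a 1x1 convolution plus batch normalization as the normalizer, and the use of the highest-level feature map to drive the weight learner are assumptions made for illustration, not the paper's official implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DFFHead(nn.Module):
    """Illustrative DFF-style fusion head (not the official implementation)."""

    def __init__(self, side_channels, num_classes=19):
        super().__init__()
        K = len(side_channels)
        # Normalizer (assumption: 1x1 conv + batch norm): brings every side
        # output to `num_classes` channels at a comparable magnitude, so
        # low-level responses are not drowned out by high-level ones.
        self.normalizers = nn.ModuleList(
            nn.Sequential(nn.Conv2d(c, num_classes, kernel_size=1),
                          nn.BatchNorm2d(num_classes))
            for c in side_channels)
        # Location-adaptive weight learner: predicts K fusion weights per
        # class at every pixel, conditioned on the highest-level features.
        self.weight_learner = nn.Conv2d(side_channels[-1],
                                        K * num_classes, kernel_size=1)
        self.K, self.C = K, num_classes

    def forward(self, sides):          # sides: K feature maps, finest first
        size = sides[0].shape[-2:]
        # Normalize each side response and upsample to a common resolution.
        resp = [F.interpolate(norm(s), size=size, mode='bilinear',
                              align_corners=False)
                for norm, s in zip(self.normalizers, sides)]
        stacked = torch.stack(resp, dim=2)            # (B, C, K, H, W)
        # Per-pixel fusion weights predicted from the top-level feature map.
        w = F.interpolate(self.weight_learner(sides[-1]), size=size,
                          mode='bilinear', align_corners=False)
        w = w.view(-1, self.C, self.K, *size)         # (B, C, K, H, W)
        return (stacked * w).sum(dim=2)               # fused edge logits
```

Here `sides` could be, for example, the Side-1/2/3/5 activations of a ResNet backbone, as in CASENet-style architectures. The essential property is that the fusion weights `w` depend on both the input image and the spatial position, rather than being fixed parameters of the network.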
Numerical Results
The paper reports a maximum F-measure (MF) score of 80.7% on the Cityscapes dataset at a matching distance tolerance of 0.02, surpassing DDS by 2.7% and CASENet by 9.4%. Under a stricter matching distance (0.0035), DFF achieves an MF score 5% higher than SEAL's, underscoring its ability to capture fine edge details accurately. On the SBD dataset, DFF achieves an MF score of 75.4%, demonstrating its robustness across datasets.
Implications and Future Directions
The dynamic feature fusion strategy proposed in this paper has significant implications for future SED research and development. By moving from fixed to dynamic fusion weights, SED models can better accommodate image variability and local semantic context, substantially improving edge detection in complex visual environments. The adaptability of DFF could also extend to other vision tasks, bringing improvements to segmentation, instance segmentation, and other pixel-level prediction tasks.
As the field progresses, further context-aware adaptive mechanisms can be explored, such as learning algorithms that refine feature utilization dynamically during inference. Such developments could pave the way for more computationally efficient and effective SED models across application domains in artificial intelligence and computer vision.
In conclusion, the paper establishes dynamic feature fusion as a promising advance in semantic edge detection, combining theoretical insight with practical improvements that yield sharper, more accurate edge delineation across diverse images.