Embedding Structured Contour and Location Prior in Siamesed Fully Convolutional Networks for Road Detection (1905.01575v1)

Published 5 May 2019 in cs.CV

Abstract: Road detection from the perspective of moving vehicles is a challenging issue in autonomous driving. Recently, many deep learning methods spring up for this task because they can extract high-level local features to find road regions from raw RGB data, such as Convolutional Neural Networks (CNN) and Fully Convolutional Networks (FCN). However, how to detect the boundary of road accurately is still an intractable problem. In this paper, we propose a siamesed fully convolutional networks (named as "s-FCN-loc"), which is able to consider RGB-channel images, semantic contours and location priors simultaneously to segment road region elaborately. To be specific, the s-FCN-loc has two streams to process the original RGB images and contour maps respectively. At the same time, the location prior is directly appended to the siamesed FCN to promote the final detection performance. Our contributions are threefold: (1) An s-FCN-loc is proposed that learns more discriminative features of road boundaries than the original FCN to detect more accurate road regions; (2) Location prior is viewed as a type of feature map and directly appended to the final feature map in s-FCN-loc to promote the detection performance effectively, which is easier than other traditional methods, namely different priors for different inputs (image patches); (3) The convergent speed of training s-FCN-loc model is 30% faster than the original FCN, because of the guidance of highly structured contours. The proposed approach is evaluated on KITTI Road Detection Benchmark and One-Class Road Detection Dataset, and achieves a competitive result with state of the arts.

Authors (3)

Qi Wang (561 papers)
Junyu Gao (63 papers)
Yuan Yuan (234 papers)

Citations (240)


Summary

The paper introduces a Siamese FCN architecture that fuses RGB and contour map data to enhance road boundary segmentation.
It incorporates two-channel spatial priors to reduce misdetections by leveraging recurring positional patterns in urban scenes.
The method reaches a Max F-measure of 93.26% on KITTI, competitive with the state of the art, and its training converges about 30% faster than the original FCN.

Embedding Structured Contour and Location Prior in Siamesed Fully Convolutional Networks for Road Detection

In intelligent transportation systems (ITS), the hardest part of road detection is delineating road boundaries accurately. The paper by Wang, Gao, and Yuan addresses this with a siamesed fully convolutional network (s-FCN-loc) that combines structured contours and location priors with the RGB input.

Siamesed Fully Convolutional Network (s-FCN-loc)

The core contribution of this paper is a siamesed FCN architecture that integrates two types of input: RGB-channel images and contour maps. Two parallel convolutional streams process the RGB image and the contour map respectively, so the network learns more discriminative features of road boundaries than the original FCN and segments road regions more accurately. Traditional FCN models handle complex pixel labeling tasks well, but they often fail to capture spatial structure precisely. The s-FCN-loc addresses this weakness by feeding structured contour information into the network.
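The two-stream idea can be illustrated with a toy NumPy sketch. It is not the authors' network: each stream is reduced to a single 1x1 convolution with ReLU, the streams are fused by channel concatenation, and the contour map is taken to be single-channel. These choices, along with the names `conv1x1` and `two_stream_features`, are assumptions made only for illustration; the summary does not specify the fusion operator or whether the streams share weights.

```python
import numpy as np

def conv1x1(x, weight):
    """Per-pixel linear map: x is (C_in, H, W), weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def relu(x):
    return np.maximum(x, 0.0)

def two_stream_features(rgb, contour, w_rgb, w_contour):
    """Run each input through its own stream, then stack the results by channel.

    Assumption: concatenation is used as the fusion step; the real model
    uses full FCN streams rather than a single 1x1 convolution.
    """
    f_rgb = relu(conv1x1(rgb, w_rgb))              # (C_feat, H, W)
    f_contour = relu(conv1x1(contour, w_contour))  # (C_feat, H, W)
    return np.concatenate([f_rgb, f_contour], axis=0)

rng = np.random.default_rng(0)
rgb = rng.random((3, 4, 6))       # toy RGB image, channels first
contour = rng.random((1, 4, 6))   # toy single-channel contour map (assumed)
w_rgb = rng.standard_normal((8, 3))
w_contour = rng.standard_normal((8, 1))

fused = two_stream_features(rgb, contour, w_rgb, w_contour)
print(fused.shape)  # (16, 4, 6)
```

A per-pixel classifier applied to `fused` would then see appearance and contour evidence side by side at every location.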

Incorporation of Location Priors

In addition to the contour stream, s-FCN-loc incorporates a location prior, represented as a two-channel feature map that describes where each pixel sits in the image. This lets the network exploit recurring positional patterns in street scenes (for example, roads mostly appear in the lower part of the image) and reduces false detections. The prior is appended directly to the final feature map, which the authors describe as easier than traditional approaches that require different priors for different inputs such as image patches.
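A minimal sketch of appending a two-channel location prior to a feature map follows. The exact encoding of the prior is not given in this summary, so the sketch assumes one plausible choice: normalized row and column coordinates. The function names `location_prior` and `append_location_prior` are hypothetical.

```python
import numpy as np

def location_prior(height, width):
    """Two (H, W) maps holding each pixel's normalized row and column in [0, 1].

    Assumption: coordinates are used as the prior; the paper may encode it differently.
    """
    rows = np.linspace(0.0, 1.0, height)
    cols = np.linspace(0.0, 1.0, width)
    row_map, col_map = np.meshgrid(rows, cols, indexing="ij")
    return np.stack([row_map, col_map], axis=0)   # (2, H, W)

def append_location_prior(features):
    """Concatenate the prior onto a (C, H, W) feature map along channels."""
    _, h, w = features.shape
    return np.concatenate([features, location_prior(h, w)], axis=0)

features = np.zeros((16, 4, 6))   # stand-in for the final feature map
with_prior = append_location_prior(features)
print(with_prior.shape)       # (18, 4, 6)
print(with_prior[16, :, 0])   # row channel: 0 at the top of the image, 1 at the bottom
```

Because the prior depends only on the feature map's height and width, the same two channels can be reused for every input of that size, which is consistent with the claim that no per-patch prior is needed.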

Performance Evaluation

The proposed method was evaluated on the KITTI road detection benchmark and the One-Class Road Detection Dataset. Training converges about 30% faster than for the original FCN, which the authors attribute to the guidance of highly structured contours. On the KITTI leaderboard, s-FCN-loc achieves a Max F-measure of 93.26%, a result competitive with state-of-the-art methods.
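For readers unfamiliar with the metric, Max F-measure is the highest F1 score obtained when the confidence threshold on the predicted road map is swept over a range of values. The sketch below computes it on flat lists of toy scores and labels. The data and the threshold grid are made up for illustration, and the official KITTI evaluation additionally works in a bird's-eye-view space that is not modeled here.

```python
def f_measure(scores, labels, threshold):
    """F1 score when pixels with score >= threshold are predicted as road."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def max_f_measure(scores, labels, thresholds):
    """Best F1 over all candidate thresholds."""
    return max(f_measure(scores, labels, t) for t in thresholds)

# Toy per-pixel road confidences and ground-truth labels (1 = road).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
thresholds = [i / 10 for i in range(1, 10)]

best = max_f_measure(scores, labels, thresholds)
print(round(best, 3))  # 0.857
```

Here the best threshold is 0.4, which recovers all three road pixels at the cost of one false positive.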

Implications and Future Work

The dual-stream configuration makes a case for exploring siamesed architectures in other pixel-wise prediction problems, extending potential applications beyond road detection to general semantic segmentation. Adding higher-order contour features strengthens the FCN's handling of boundaries and may inform the design of other models for environmental and spatial recognition.

The research paves the way for additional studies examining the balance and interaction between high-level structured data and more traditional image features. Future work may seek to adapt similar principles to different scene contexts, enhancing model robustness across varying urban planning and infrastructure designs. The convergence dynamics in large-scale environments might also be an avenue for optimization, with potential adaptations in network architecture to maximize the utility of contour information. The presented methodology thus sets a foundation for evolving road detection systems towards improved precision and efficiency.
