RESA: Recurrent Feature-Shift Aggregator for Lane Detection (2008.13719v2)

Published 31 Aug 2020 in cs.CV

Abstract: Lane detection is one of the most important tasks in self-driving. Due to various complex scenarios (e.g., severe occlusion, ambiguous lanes, etc.) and the sparse supervisory signals inherent in lane annotations, lane detection remains challenging, and it is difficult for an ordinary convolutional neural network (CNN) trained on general scenes to capture subtle lane features from raw images. In this paper, we present a novel module named REcurrent Feature-Shift Aggregator (RESA) to enrich lane features after preliminary feature extraction with an ordinary CNN. RESA takes advantage of the strong shape priors of lanes and captures spatial relationships of pixels across rows and columns. It shifts sliced feature maps recurrently in vertical and horizontal directions, enabling each pixel to gather global information. By aggregating sliced feature maps, RESA can infer lanes accurately in challenging scenarios with weak appearance clues. Moreover, we propose a Bilateral Up-Sampling Decoder that combines coarse-grained and fine-detailed features in the up-sampling stage, meticulously recovering the low-resolution feature map into a pixel-wise prediction. Our method achieves state-of-the-art results on two popular lane detection benchmarks (CULane and Tusimple). Code has been made available at: https://github.com/ZJULearning/resa.

Authors (7)
  1. Tu Zheng (11 papers)
  2. Hao Fang (88 papers)
  3. Yi Zhang (994 papers)
  4. Wenjian Tang (2 papers)
  5. Zheng Yang (69 papers)
  6. Haifeng Liu (56 papers)
  7. Deng Cai (181 papers)
Citations (221)

Summary

An Evaluation of RESA: Recurrent Feature-Shift Aggregator for Lane Detection

The paper "RESA: Recurrent Feature-Shift Aggregator for Lane Detection" presents an advanced approach for lane detection, a critical task for autonomous driving systems. The authors introduce an innovative module called the Recurrent Feature-Shift Aggregator (RESA), designed to enhance lane features following initial extractions using conventional Convolutional Neural Networks (CNNs). This module constructs a novel method of enforcing spatial relationships and pixel aggregation across images, distinctly improving lane detection particularly under challenging conditions, such as severe occlusion or ambiguous lane markings.

RESA exploits the strong shape priors of lanes and the spatial relationships among pixels by recurrently shifting slices of the feature map in the vertical and horizontal directions. Because the shift stride changes across iterations, each pixel can gather global information in only a logarithmic number of steps, allowing more accurate lane inference where visual cues are weak. This robustness matters in practice: lanes remain detectable under occlusion by vehicles, challenging lighting, and complex lane geometries.
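
Below is a minimal, illustrative sketch of the recurrent feature-shift idea for a single direction (downward), written in PyTorch since the released code is PyTorch-based. The iteration count, stride schedule, shared convolution, and circular wrap-around are assumptions made for illustration and may differ from the official implementation at https://github.com/ZJULearning/resa.

```python
# Sketch of RESA-style recurrent feature shifting, one direction only.
# Assumes an NCHW feature map whose height is a power of two.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResaShiftDown(nn.Module):
    """Recurrently aggregates information along the vertical axis."""
    def __init__(self, channels: int, height: int, kernel_w: int = 9):
        super().__init__()
        self.iters = int(math.log2(height))  # assumes height is a power of two
        self.height = height
        # One 1 x kernel_w convolution shared across iterations (an assumption;
        # the official code may use per-iteration weights).
        self.conv = nn.Conv2d(channels, channels, (1, kernel_w),
                              padding=(0, kernel_w // 2), bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W)
        for k in range(self.iters):
            stride = self.height // (2 ** (k + 1))  # H/2, H/4, ..., 1
            # Every row receives features from the row `stride` below it,
            # with circular wrap-around; all rows are updated in parallel.
            shifted = torch.roll(x, shifts=-stride, dims=2)
            x = x + F.relu(self.conv(shifted))
        return x

feat = torch.randn(1, 128, 64, 100)          # e.g. a backbone output
feat = ResaShiftDown(128, height=64)(feat)   # rows now share vertical context
```

In the full module, analogous passes run upward and along both horizontal directions, so every pixel ultimately aggregates information from its entire row and column.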

A second core component of the framework is the Bilateral Up-Sampling Decoder (BUSD), which converts low-resolution features into pixel-wise predictions. It employs two branches, one for coarse-grained upsampling and one for fine-detail recovery, so that lane boundaries are rendered accurately at the pixel level. This contrasts with conventional decoders, which often fail to exploit the high-resolution detail that lane detection requires.
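
As a hedged sketch of what such a bilateral up-sampling block might look like, the snippet below pairs a coarse branch (bilinear interpolation plus a 1x1 projection) with a fine-detail branch (a learned transposed convolution) and sums them. The exact layer composition, channel widths, and normalization choices are assumptions for illustration, not the paper's verbatim recipe.

```python
# Illustrative bilateral up-sampling block in the spirit of BUSD.
import torch
import torch.nn as nn

class BilateralUpBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Coarse branch: cheap 2x bilinear upsampling, then channel projection.
        self.coarse = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # Fine branch: learned 2x upsampling that can recover detail.
        self.fine = nn.Sequential(
            nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2,
                               padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.relu(self.coarse(x) + self.fine(x))

# Stacking three such blocks restores a 1/8-resolution map to full resolution.
decoder = nn.Sequential(BilateralUpBlock(128, 64),
                        BilateralUpBlock(64, 32),
                        BilateralUpBlock(32, 16))
```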

The method was validated empirically on two well-known benchmarks, CULane and TuSimple, where it achieved state-of-the-art results: a 75.3% F1 score on CULane and 96.8% accuracy on TuSimple. These results are particularly noteworthy given the varied traffic scenarios and environmental conditions the two benchmarks cover.

The paper also contrasts RESA with prior methods for exploiting spatial information, most notably SCNN. RESA propagates information in parallel within each iteration and covers the full feature map in logarithmically many iterations, whereas SCNN passes information slice by slice in sequence, incurring substantially longer inference times.
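
A back-of-the-envelope count makes this efficiency argument concrete: sequential slice-by-slice passing needs on the order of H dependent steps per direction, while a stride-doubling scheme needs only log2(H) iterations, each parallel across slices. The height H = 64 below is just an example value, not one fixed by the paper.

```python
# Rough step-count comparison behind the speed argument.
import math

H = 64
scnn_steps = H - 1              # sequential slice-to-slice passes, one direction
resa_iters = int(math.log2(H))  # parallel shift-and-add iterations
print(scnn_steps, resa_iters)   # -> 63 6
```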

RESA's architecture is compatible with diverse encoder backbones, positioning it as a versatile component that could integrate readily into existing ADAS stacks or standalone vehicle-autonomy solutions. Its modularity and speed advantage over previous methods suggest it can serve as a strong baseline for future work on lane detection.

The authors' approach avoids substantial computational overhead while scaling to complex road scenarios, broadening its potential use in real-time systems. In the fast-moving field of autonomous driving, RESA's methodology opens avenues for refining perception systems.

Looking forward, this research could inform more adaptive lane detection systems that adjust dynamically to new datasets and driving conditions. RESA's framework might also extend to related perception tasks such as obstacle detection or traffic sign recognition, in line with broader driver-assistance and safety goals.

In conclusion, the gains delivered by RESA chart a compelling trajectory for research and industry stakeholders seeking to improve lane detection precision and efficiency within autonomous driving ecosystems. The state-of-the-art results reported in the paper mark a promising step forward for computer vision in intelligent transportation systems.