- The paper introduces a semi-supervised segmentation method that leverages sensor fusion to create path and obstacle labels with minimal manual annotation.
- It employs a monocular camera combined with odometry and LIDAR data to project the vehicle's future path into camera images of complex urban scenes.
- Evaluations on KITTI and Oxford datasets demonstrate robust performance with IoU scores exceeding 80% under varied lighting and weather conditions.
Semi-Supervised Path Proposal Segmentation for Autonomous Driving
This paper presents a novel semi-supervised approach for segmenting drivable paths and obstacles in images, aimed at enhancing autonomous driving in complex urban environments. The approach leverages large amounts of image data from real-world driving with little to no manual annotation, relying on a monocular camera together with additional sensor data collected during routine vehicle operation.
Methodology
The methodology involves two primary components: data collection and labeling, and model training. The authors use a vehicle equipped with a monocular camera plus sensors providing odometry and obstacle detection. These additional sensors are not required when the trained model runs in real time, but they are essential for generating the semi-supervised training labels. The pipeline proceeds as follows:
- Sensor Configuration and Data Collection: A monocular camera is paired with odometry sensors and a LIDAR scanner. The vehicle captures images continuously and estimates its motion, while the LIDAR system identifies obstacles.
- Label Generation: Using the collected data, the vehicle's future path is projected into the current image to create proposed path labels; obstacle labels are generated in parallel by projecting 3D LIDAR points onto the image frames and marking those pixels as obstacles (a projection sketch follows this list).
- Network Training: The automatically labeled data are used to train a deep semantic segmentation network, specifically the SegNet architecture (a training sketch also follows this list). This procedure circumvents the need for extensive manual labeling, enabling the creation of a substantial dataset from vehicle navigation patterns and environmental perception alone.
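To make the label-generation step concrete, here is a minimal sketch of projecting future vehicle positions (recovered from odometry) into the current camera frame with a pinhole model. The function name, the `half_width` value, and the axis conventions are illustrative assumptions; the paper's pipeline fills the region between the projected wheel tracks, whereas this sketch only marks the projected points.

```python
import numpy as np

def project_path_labels(image_shape, future_poses, K, T_cam_from_vehicle,
                        half_width=0.8):
    """Rasterize a proposed-path label by projecting future vehicle
    positions (from odometry) into the current camera image.

    image_shape: (height, width) of the camera image.
    future_poses: (N, 4, 4) homogeneous vehicle poses at future times,
        expressed in the current vehicle frame (x forward, y left, z up).
    K: (3, 3) camera intrinsic matrix from calibration.
    T_cam_from_vehicle: (4, 4) extrinsic transform, camera <- vehicle.
    half_width: half the vehicle track width in metres (illustrative).
    """
    h, w = image_shape
    label = np.zeros((h, w), dtype=np.uint8)
    for pose in future_poses:
        # Left and right wheel contact points on the ground at this pose.
        for lateral in (half_width, -half_width):
            p_vehicle = pose @ np.array([0.0, lateral, 0.0, 1.0])
            p_cam = T_cam_from_vehicle @ p_vehicle
            if p_cam[2] <= 0.1:           # behind or too close to the camera
                continue
            uvw = K @ p_cam[:3]           # pinhole projection
            u, v = int(uvw[0] / uvw[2]), int(uvw[1] / uvw[2])
            if 0 <= u < w and 0 <= v < h:
                label[v, u] = 1           # pixel belongs to the proposed path
    return label
```

The same projection applied to LIDAR returns, with a different class index, yields the obstacle labels.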
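The training step can be sketched similarly. The paper trains SegNet, which common deep learning libraries do not ship, so the stand-in below is a minimal PyTorch encoder-decoder; the layer sizes, class count, ignore index, and optimizer hyperparameters are illustrative assumptions rather than the published configuration.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 3  # e.g. path / obstacle / background (assumed label set)

# Minimal encoder-decoder stand-in for the SegNet architecture.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, NUM_CLASSES, 4, stride=2, padding=1),
)

# Pixels the sensors could not label are skipped via ignore_index.
criterion = nn.CrossEntropyLoss(ignore_index=255)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train_step(images, labels):
    """images: (B, 3, H, W) float tensor of camera frames;
    labels: (B, H, W) long tensor built from odometry/LIDAR projections."""
    optimizer.zero_grad()
    logits = model(images)            # (B, NUM_CLASSES, H, W) per-pixel scores
    loss = criterion(logits, labels)  # per-pixel cross-entropy
    loss.backward()
    optimizer.step()
    return loss.item()
```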
Evaluation and Results
The evaluation was conducted using two prominent datasets: the KITTI dataset and the Oxford RobotCar dataset. These datasets provide diverse environmental settings, including different weather and lighting conditions. The method was assessed based on its ability to generalize across different scenes and conditions rather than specific road markings or predetermined lanes.
- Oxford Dataset: The trained model achieved Intersection over Union (IoU) scores above 80% across varied lighting and weather conditions, underscoring its potential to function reliably in typical driving scenarios (an IoU sketch follows this list).
- KITTI Benchmarks: On KITTI, the model was evaluated for ego-lane segmentation and obstacle detection. Although these task definitions differ slightly from the paper's primary focus, the model performed competitively, illustrating its viability in real-world driving applications.
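For reference, the Intersection over Union metric behind these scores can be computed per class as in the sketch below; the function and variable names are illustrative.

```python
import numpy as np

def iou(pred, target, class_id):
    """Per-class Intersection over Union between two integer label maps."""
    p = (pred == class_id)
    t = (target == class_id)
    union = np.logical_or(p, t).sum()
    if union == 0:
        return float("nan")           # class absent from both maps
    return np.logical_and(p, t).sum() / union
```

A value of 0.8 from this function corresponds to the 80% IoU figures reported above.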
Implications and Future Directions
The implications of this research are multifaceted. Practically, the reduction in reliance on extensive manual annotation streamlines the development of robust autonomous driving systems, potentially accelerating deployment in urban areas. Theoretically, the work contributes to the understanding of semi-supervised learning methods in complex environments, particularly within the field of autonomous navigation.
The paper suggests future endeavors could focus on integrating this semi-supervised segmentation system into a broader autonomous driving framework. This integration would encompass decision-making strategies for navigation, further refinement of path proposals at intersections, and enhanced interaction with static and dynamic obstacles by considering real-time sensory input.
This research represents an important step toward scalable, adaptable autonomous driving systems, offering a way to perceive complex road environments efficiently and accurately. Integrating such models with existing navigation and planning methodologies could further advance autonomous vehicle technology.