
WoodScape: A multi-task, multi-camera fisheye dataset for autonomous driving (1905.01489v3)

Published 4 May 2019 in cs.CV, cs.AI, cs.LG, cs.RO, and stat.ML

Abstract: Fisheye cameras are commonly employed for obtaining a large field of view in surveillance, augmented reality and in particular automotive applications. In spite of their prevalence, there are few public datasets for detailed evaluation of computer vision algorithms on fisheye images. We release the first extensive fisheye automotive dataset, WoodScape, named after Robert Wood who invented the fisheye camera in 1906. WoodScape comprises of four surround view cameras and nine tasks including segmentation, depth estimation, 3D bounding box detection and soiling detection. Semantic annotation of 40 classes at the instance level is provided for over 10,000 images and annotation for other tasks are provided for over 100,000 images. With WoodScape, we would like to encourage the community to adapt computer vision models for fisheye camera instead of using naive rectification.

Citations (241)

Summary

  • The paper introduces WoodScape as the first fisheye camera dataset tailored for autonomous driving with extensive multi-task annotations.
  • It employs a sensor suite of four fisheye cameras, LiDAR, IMU, and GNSS to capture 360-degree real-world driving scenarios and complex distortion models.
  • Baseline analyses in segmentation, depth estimation, and 3D bounding box detection demonstrate promising performance, encouraging unified multi-task model approaches.

Overview of WoodScape: A Multi-Task, Multi-Camera Fisheye Dataset for Autonomous Driving

The paper presents WoodScape, an innovative dataset tailored for the emerging needs in autonomous driving systems, particularly focusing on the use of fisheye cameras. It aims to bridge a considerable gap in existing datasets by offering a comprehensive range of annotated tasks on fisheye imagery, a staple in modern automotive camera systems due to their extensive field of view (FOV).

Significance and Contributions

WoodScape is the first dataset dedicated to fisheye cameras in autonomous driving, encompassing:

  • Dataset Composition: Comprising over 10,000 images with instance-level semantic annotations and over 100,000 images annotated for various tasks.
  • Sensor Configuration: Utilizes four surrounding view fisheye cameras, capturing a complete 360-degree environment, along with a LiDAR sensor, IMU, and GNSS, to provide accurate ground truth data.
  • Multi-Task Annotations: Covers nine tasks, including segmentation, depth estimation, 3D bounding box detection, and a novel task—soiling detection.
  • Unified Learning Models: Encourages research on multi-camera, multi-task models that operate natively on fisheye imagery, rather than relying on the image rectification traditionally used to linearize fisheye distortion.

Dataset Design and Acquisition

The dataset was gathered from diverse geographical regions and vehicles to ensure broad environmental variability. It contains semantic segmentation data for 40 classes. The fisheye camera model is presented in detail, with an intrinsic calibration that does not assume a standard projection model, which is crucial for capturing the complex distortion characteristics inherent to fisheye imagery.
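As a minimal sketch of this idea, fisheye intrinsics are often calibrated with a polynomial mapping from the incidence angle to the image radius, rather than a fixed analytic projection. The coefficients and principal point below are illustrative, not values from the dataset:

```python
import numpy as np

def project_fisheye(points_3d, coeffs, cx, cy):
    """Project 3D camera-frame points with a polynomial fisheye model.

    The radial mapping is r(theta) = k1*theta + k2*theta^2 + ...,
    where theta is the angle of incidence from the optical axis.
    """
    X, Y, Z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    chi = np.hypot(X, Y)               # distance of the ray from the optical axis
    theta = np.arctan2(chi, Z)         # incidence angle
    r = sum(k * theta ** (i + 1) for i, k in enumerate(coeffs))
    # Scale the unit in-plane direction by the mapped radius.
    safe_chi = np.maximum(chi, 1e-9)   # avoid division by zero on-axis
    u = cx + r * X / safe_chi
    v = cy + r * Y / safe_chi
    return np.stack([u, v], axis=1)
```

Because the polynomial is fitted per camera, the same code handles lenses whose distortion departs from any single analytic model (equidistant, stereographic, etc.).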

Analytical Insight into Fisheye Camera Models

Fisheye lenses pose a unique challenge due to their radial distortion, which standard models fail to address effectively. The paper details several fisheye models, illustrating the distortions through comparative metrics. The authors encourage model adaptation over undistortion strategies—where image integrity can be compromised—stressing algorithmic refinements that respect the fisheye geometry.
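The cost of undistortion can be seen with a small numeric comparison: under an equidistant fisheye mapping the image radius grows linearly with the incidence angle, while a rectified (pinhole) view grows with its tangent and diverges toward 90 degrees, so the periphery of a wide-FOV image is stretched enormously. The focal length here is illustrative:

```python
import numpy as np

f = 500.0                              # focal length in pixels (illustrative)
angles_deg = np.array([10, 45, 80])
theta = np.radians(angles_deg)

r_fisheye = f * theta                  # equidistant fisheye: linear in angle
r_pinhole = f * np.tan(theta)          # rectified/pinhole: diverges near 90 deg

for a, rf, rp in zip(angles_deg, r_fisheye, r_pinhole):
    print(f"{a:3d} deg  fisheye r = {rf:7.1f} px   rectified r = {rp:8.1f} px")
```

At 80 degrees the rectified radius is several times the fisheye radius, which is why rectifying a ~190-degree lens either crops most of the FOV or resamples the periphery beyond usefulness, motivating the model-adaptation approach the authors advocate.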

Descriptive Analysis of Tasks and Baseline Results

The paper undertakes a comprehensive exploration of individual tasks supported by the WoodScape dataset:

  • Semantic Segmentation: Employs models such as ENet to handle the distinctive characteristics of fisheye imagery, with baseline results reported as IoU.
  • 3D Bounding Box Detection: Introducing a novel metric, SRT (Scaling-Rotation-Translation), to handle the orientation challenges posed by distorted perspectives.
  • Specialized Tasks: The inclusion of soiling detection is compelling for autonomous systems, offering insights into practical deployment challenges in car cameras exposed to adverse conditions.
  • Advanced Vision Tasks: Includes monocular depth estimation and visual odometry/SLAM, adapted explicitly to the fisheye setting.
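For reference, the IoU metric used to score the segmentation baselines can be sketched as follows; this is the standard definition, not code from the paper:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes present in either map.

    `pred` and `target` are integer label arrays of the same shape.
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))
```

Per-class IoU penalizes both false positives and false negatives, which matters on fisheye imagery where small peripheral objects occupy few, heavily distorted pixels.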

Implications and Future Directions

WoodScape provides a significant leap for the automotive vision community by facilitating models trained specifically on fisheye data. The implications are vast for developing advanced driver-assistance systems with improved situational awareness from a panoramic perspective. Future work is expected to delve into integrated multi-task models, combining the wealth of information from different sensing modalities to drive innovation in fully autonomous vehicles.

This paper represents a pivotal reference for researchers aiming to harness the untapped potential of fisheye lens data, providing practical insights into algorithm development along with a robust dataset to fuel further advances in autonomous driving technologies.