
WaterScenes: A Multi-Task 4D Radar-Camera Fusion Dataset and Benchmarks for Autonomous Driving on Water Surfaces (2307.06505v3)

Published 13 Jul 2023 in cs.CV and cs.RO

Abstract: Autonomous driving on water surfaces plays an essential role in executing hazardous and time-consuming missions, such as maritime surveillance, survivors rescue, environmental monitoring, hydrography mapping and waste cleaning. This work presents WaterScenes, the first multi-task 4D radar-camera fusion dataset for autonomous driving on water surfaces. Equipped with a 4D radar and a monocular camera, our Unmanned Surface Vehicle (USV) proffers all-weather solutions for discerning object-related information, including color, shape, texture, range, velocity, azimuth, and elevation. Focusing on typical static and dynamic objects on water surfaces, we label the camera images and radar point clouds at pixel-level and point-level, respectively. In addition to basic perception tasks, such as object detection, instance segmentation and semantic segmentation, we also provide annotations for free-space segmentation and waterline segmentation. Leveraging the multi-task and multi-modal data, we conduct benchmark experiments on the uni-modality of radar and camera, as well as the fused modalities. Experimental results demonstrate that 4D radar-camera fusion can considerably improve the accuracy and robustness of perception on water surfaces, especially in adverse lighting and weather conditions. WaterScenes dataset is public on https://waterscenes.github.io.

Authors (14)
  1. Shanliang Yao (14 papers)
  2. Runwei Guan (25 papers)
  3. Zhaodong Wu (4 papers)
  4. Yi Ni (32 papers)
  5. Zile Huang (3 papers)
  6. Yong Yue (14 papers)
  7. Weiping Ding (53 papers)
  8. Eng Gee Lim (38 papers)
  9. Hyungjoon Seo (4 papers)
  10. Ka Lok Man (17 papers)
  11. Xiaohui Zhu (15 papers)
  12. Yutao Yue (52 papers)
  13. Ryan Wen Liu (28 papers)
  14. Jieming Ma (7 papers)
Citations (14)

Summary

  • The paper introduces the first multi-task 4D radar-camera fusion dataset for autonomous water surface navigation with detailed pixel and point annotations.
  • The paper demonstrates that integrating radar with camera data significantly boosts perception accuracy, particularly in challenging weather and lighting conditions.
  • The research benchmarks fusion against single-modality approaches, highlighting its superior robustness and potential to improve USV safety and performance.

WaterScenes: A Multi-Task 4D Radar-Camera Fusion Dataset and Benchmarks for Autonomous Driving on Water Surfaces

The paper "WaterScenes" introduces an innovative dataset designed to advance research in autonomous surface vehicles (USVs) operating on water surfaces. This dataset is the first of its kind to provide a multi-task, 4D radar-camera fusion benchmark tailored for autonomous driving on aquatic environments. It is particularly relevant for tasks including maritime surveillance, environmental monitoring, hydrography mapping, and disaster response.

WaterScenes capitalizes on the fusion of 4D radar and monocular camera data to provide comprehensive perception capabilities for USVs. The dataset is annotated at pixel level for camera images and at point level for radar point clouds. These annotations cover a range of perception tasks: object detection, instance segmentation, semantic segmentation, free-space segmentation, and waterline segmentation. The data span diverse lighting and weather conditions, supporting the development of algorithms that remain reliable when the environment degrades.
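As a rough illustration of how such paired camera/radar samples might be consumed, the following is a minimal loader sketch. The directory layout, file names, and radar column names used here are assumptions for illustration only; the actual WaterScenes structure is documented at https://waterscenes.github.io.

```python
# Hypothetical loader sketch. Assumed per-frame layout (not verified against the release):
#   <root>/image/<frame_id>.jpg   -- monocular camera image
#   <root>/radar/<frame_id>.csv   -- radar points with columns x, y, z, doppler
from pathlib import Path

import numpy as np
import pandas as pd
from PIL import Image


def load_frame(root: str, frame_id: str):
    """Load one synchronized camera/radar sample (illustrative only)."""
    root = Path(root)
    image = np.asarray(Image.open(root / "image" / f"{frame_id}.jpg"))
    radar = pd.read_csv(root / "radar" / f"{frame_id}.csv")
    # Each radar row is a 4D measurement: 3D position plus Doppler (radial) velocity.
    points = radar[["x", "y", "z", "doppler"]].to_numpy(dtype=np.float32)
    return image, points


if __name__ == "__main__":
    img, pts = load_frame("WaterScenes", "000001")
    print(img.shape, pts.shape)
```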

Key results presented in the paper illustrate that fusing the radar and camera modalities markedly enhances perception accuracy. In particular, experiments show that fusion bolsters performance in adverse conditions where standalone sensor modalities often falter: the radar augments the camera by providing range and velocity measurements that are far less affected by rain, fog, or poor lighting.
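A common way to realize this complementarity is to project radar points into the image plane and attach their range and Doppler values to the corresponding pixels. The sketch below shows that geometric step under stated assumptions: the radar-to-camera extrinsic matrix and camera intrinsics are placeholders, not the dataset's actual calibration, and the fusion strategy is a generic example rather than the specific method benchmarked in the paper.

```python
# Minimal sketch of radar-to-image projection for feature-level fusion.
# T_radar_to_cam (4x4 extrinsics) and K (3x3 intrinsics) are assumed to be known
# from calibration; the values used by WaterScenes may differ.
import numpy as np


def project_radar_to_image(points, T_radar_to_cam, K, image_shape):
    """points: (N, 4) array of [x, y, z, doppler] in the radar coordinate frame."""
    xyz_h = np.hstack([points[:, :3], np.ones((len(points), 1))])  # homogeneous coords
    cam = (T_radar_to_cam @ xyz_h.T).T[:, :3]                      # radar -> camera frame
    in_front = cam[:, 2] > 0                                       # keep points ahead of the camera
    cam, doppler = cam[in_front], points[in_front, 3]
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                                    # perspective divide
    h, w = image_shape[:2]
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    # Per-point features usable as extra image channels: pixel coords, depth, radial velocity.
    return uv[valid], cam[valid, 2], doppler[valid]
```

The returned depth and Doppler values can then be rasterized into sparse image channels and concatenated with the RGB input of a detector or segmentation network, which is one standard route by which radar information compensates for degraded camera cues.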

The paper also reports benchmarks across sensor modalities (camera only, radar only, and fused data), demonstrating the superior performance of the fused approach. While camera-based systems are susceptible to occlusion and lighting changes, radar detects objects reliably regardless of visibility, and combining the two yields a more resilient perception system.

The paper suggests that insights gleaned from this dataset may catalyze advancements in the design of perception systems that are robust to the nuanced challenges of aquatic environments. It proposes future exploration into refined fusion techniques to further leverage the synergy of radar and camera data. Expanding on this research could lead to significant improvements in model reliability and deployment readiness, enhancing the safety and effectiveness of USVs in their widening array of applications.

WaterScenes represents a pivotal step towards mitigating the traditional challenges of USV deployment in dynamic and unpredictable water conditions, encouraging the development of more sophisticated fusion-based perception models. This contribution lays a foundation for future work in autonomous maritime navigation and environmental interaction, and the public availability of the dataset is expected to stimulate further innovation and collaboration in the domain of autonomous marine vehicles.
