- The paper introduces a novel approach using a range residual image and dynamic meta-kernel to capture detailed spatial-temporal features in LiDAR data.
- It demonstrates superior performance with state-of-the-art mean IoU scores and processes data at 22 frames per second on benchmark datasets.
- The methodology paves the way for improved real-time segmentation in autonomous driving and robotics, enhancing both accuracy and efficiency.
An Overview of "Meta-RangeSeg: LiDAR Sequence Semantic Segmentation Using Multiple Feature Aggregation"
Meta-RangeSeg provides a new approach to semantic segmentation of LiDAR sequences using multiple feature aggregation. The paper addresses limitations of traditional methods for segmenting 3D point clouds by improving how neighborhood information and temporal sequences are handled. By introducing a range residual image that captures spatial-temporal information, Meta-RangeSeg advances LiDAR sequence semantic segmentation toward the real-time requirements of autonomous vehicles and intelligent robots.
Methodology
The paper proposes Meta-RangeSeg, a novel approach that represents multiple scans with a range residual image rather than fusing point clouds directly, encapsulating both spatial and temporal information more compactly. Key components include:
- Range Residual Image: This feature captures temporal variance between LiDAR scans through a 9-channel image that includes range differences across scans, enabling multi-frame temporal information to be represented efficiently.
- Meta-Kernel for Feature Extraction: The Meta-Kernel dynamically learns weights from relative Cartesian coordinates and range values to derive meta features, bridging the mismatch between the 2D range image grid and the underlying 3D Cartesian geometry.
- U-Net Backbone: Employing a U-Net architecture allows the extraction of multi-scale features essential for segmenting the LiDAR data effectively.
- Feature Aggregation Module (FAM): The FAM combines the meta features with multi-scale context from the backbone, strengthening the role of the range channel during aggregation.
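To make the range residual image concrete, the sketch below projects a point cloud onto a 2D range image and stacks it with normalized range differences from past scans. This is a minimal illustration, not the authors' implementation: the function names, image resolution, and field-of-view values are assumptions (the resolution and FOV here follow common 64-beam LiDAR conventions), and past scans are assumed to be already ego-motion compensated into the current frame.

```python
import numpy as np

def range_projection(points, h=64, w=1024, fov_up=3.0, fov_down=-25.0):
    """Spherical projection of an (N, 3) point cloud onto a (h, w) range image."""
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    fov = fov_up - fov_down
    r = np.linalg.norm(points, axis=1)
    yaw = -np.arctan2(points[:, 1], points[:, 0])
    pitch = np.arcsin(points[:, 2] / np.maximum(r, 1e-8))
    u = np.clip(((yaw / np.pi + 1.0) * 0.5 * w).astype(int), 0, w - 1)
    v = np.clip(((1.0 - (pitch - fov_down) / fov) * h).astype(int), 0, h - 1)
    img = np.zeros((h, w), dtype=np.float32)
    img[v, u] = r  # later points overwrite earlier ones in the same cell
    return img

def range_residual_image(current, past_scans):
    """Stack the current range image with per-scan normalized range residuals."""
    cur = range_projection(current)
    residuals = []
    for scan in past_scans:  # assumed aligned to the current ego pose
        past = range_projection(scan)
        valid = (cur > 0) & (past > 0)  # only where both scans hit something
        res = np.zeros_like(cur)
        res[valid] = np.abs(cur[valid] - past[valid]) / cur[valid]
        residuals.append(res)
    return np.stack([cur] + residuals, axis=0)  # (1 + N) channels
```

With eight past scans, this yields the 9-channel input described above: one range channel plus eight residual channels encoding temporal change.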
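The Meta-Kernel idea of learning convolution weights from geometry can be sketched as follows: for each 3x3 neighborhood on the range image, a small shared MLP maps the relative (x, y, z, range) offsets to a per-neighbor weight vector, which modulates the neighbor's features before summation. This is a simplified NumPy sketch under stated assumptions: the MLP shapes (`w1`, `b1`, `w2`, `b2`) and the plain summation over the window are illustrative choices, and the paper's subsequent 1x1 convolution over the aggregated features is omitted.

```python
import numpy as np

def meta_kernel(features, coords, w1, b1, w2, b2):
    """Dynamic 3x3 convolution whose weights are generated from geometry.

    features: (C, H, W) feature map on the range image
    coords:   (4, H, W) per-pixel (x, y, z, range) from the projection
    w1 (D, 4), b1 (D,), w2 (C, D), b2 (C,): shared-MLP parameters (hypothetical)
    """
    C, H, W = features.shape
    out = np.zeros_like(features)
    pad_f = np.pad(features, ((0, 0), (1, 1), (1, 1)))
    pad_c = np.pad(coords, ((0, 0), (1, 1), (1, 1)))
    for dv in range(3):
        for du in range(3):
            nb_f = pad_f[:, dv:dv + H, du:du + W]         # neighbor features
            rel = pad_c[:, dv:dv + H, du:du + W] - coords  # relative geometry
            # shared MLP: relative (x, y, z, r) -> C-dim weight vector
            hid = np.maximum(0, np.einsum('dc,cij->dij', w1, rel)
                             + b1[:, None, None])
            wgt = np.einsum('cd,dij->cij', w2, hid) + b2[:, None, None]
            out += wgt * nb_f  # element-wise product, summed over the window
    return out
```

Because the weights depend on each neighbor's 3D offset rather than its fixed grid position, the same kernel adapts to the irregular 3D geometry hidden behind the regular 2D range image.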
Experimental Design and Results
In evaluating Meta-RangeSeg, the authors employed SemanticKITTI and SemanticPOSS, two widely used benchmarks for LiDAR segmentation. The method not only demonstrates superior accuracy over previous state-of-the-art methods but also runs at up to 22 frames per second, showcasing its practical effectiveness in real-time applications. Notably, the full Meta-RangeSeg implementation consistently achieved better mean intersection-over-union (mIoU) scores in both multiple-scan and single-scan scenarios.
Implications and Future Directions
The implications of Meta-RangeSeg extend beyond theoretical value, potentially impacting practical applications in autonomous driving systems and other robotics fields that rely heavily on efficient and accurate LiDAR perception. The development of the range residual image and Meta-Kernel suggests a novel direction for LiDAR-based research, with potential extensions to optimize further computational efficiency and reduce memory consumption.
Moving forward, optimizing the interconnections between modules and reducing computational cost present promising avenues for future research. Integrating additional sensing modalities and exploring temporal sequence handling more deeply could open new frontiers in the evolving landscape of autonomous vehicle technologies and intelligent robotic systems.