SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud (1809.08495v1)

Published 22 Sep 2018 in cs.CV

Abstract: Earlier work demonstrates the promise of deep-learning-based approaches for point cloud segmentation; however, these approaches need to be improved to be practically useful. To this end, we introduce a new model SqueezeSegV2 that is more robust to dropout noise in LiDAR point clouds. With improved model structure, training loss, batch normalization and additional input channel, SqueezeSegV2 achieves significant accuracy improvement when trained on real data. Training models for point cloud segmentation requires large amounts of labeled point-cloud data, which is expensive to obtain. To sidestep the cost of collection and annotation, simulators such as GTA-V can be used to create unlimited amounts of labeled, synthetic data. However, due to domain shift, models trained on synthetic data often do not generalize well to the real world. We address this problem with a domain-adaptation training pipeline consisting of three major components: 1) learned intensity rendering, 2) geodesic correlation alignment, and 3) progressive domain calibration. When trained on real data, our new model exhibits segmentation accuracy improvements of 6.0-8.6% over the original SqueezeSeg. When training our new model on synthetic data using the proposed domain adaptation pipeline, we nearly double test accuracy on real-world data, from 29.0% to 57.4%. Our source code and synthetic dataset will be open-sourced.

Authors (5)

Bichen Wu
Xuanyu Zhou
Sicheng Zhao
Xiangyu Yue
Kurt Keutzer

Summary

SqueezeSegV2: Enhanced Road-Object Segmentation from LiDAR Point Clouds

This paper introduces SqueezeSegV2, a successor to the original SqueezeSeg model for road-object segmentation from LiDAR point clouds. The key enhancements are an improved model structure, a revised training loss, batch normalization, an additional input channel, and a domain adaptation pipeline that addresses the domain shift encountered when models trained on synthetic data are applied to real-world data.

Improved Model Architecture

SqueezeSegV2 makes several architectural refinements to improve segmentation performance. Most notably, the Context Aggregation Module (CAM) mitigates the impact of dropout noise, that is, missing points in the LiDAR point cloud, a common issue caused by sensor limitations or environmental conditions. CAM aggregates contextual information over a large receptive field, which makes the model more robust to such missing points. Together with the other changes to the model and training, this yields segmentation accuracy improvements of 6.0% to 8.6% over the original SqueezeSeg when trained on real data.
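As a rough illustration of how aggregating context can compensate for missing points, the sketch below implements one plausible CAM-style block in NumPy: stride-1 max pooling over a large window, two 1x1 convolutions (written as matrix products over the channel axis) with a ReLU between them, and a sigmoid gate multiplied back onto the input. This layout, the kernel size `k=7`, and the names `max_pool_same`, `context_aggregation`, `w1`, and `w2` are assumptions made for illustration; the summary does not specify the module's internals.

```python
import numpy as np

def max_pool_same(x, k):
    """Stride-1 max pooling with 'same' padding over an (H, W, C) array; k must be odd."""
    pad = k // 2
    padded = np.pad(x, ((pad, pad), (pad, pad), (0, 0)), constant_values=-np.inf)
    # windows has shape (H, W, C, k, k): one k x k neighborhood per pixel and channel
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k), axis=(0, 1))
    return windows.max(axis=(-2, -1))

def context_aggregation(x, w1, w2, k=7):
    """Hypothetical CAM-style block: pooled context -> 1x1 conv -> ReLU -> 1x1 conv -> sigmoid gate."""
    x = np.asarray(x, dtype=float)
    pooled = max_pool_same(x, k)                  # context from a large neighborhood
    hidden = np.maximum(pooled @ w1, 0.0)         # 1x1 convolution + ReLU
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # 1x1 convolution + sigmoid, values in (0, 1)
    return x * gate                               # re-weight the input features

rng = np.random.default_rng(0)
features = np.ones((8, 16, 4))
features[3, 5, :] = 0.0                           # simulate a dropped LiDAR return
pooled = max_pool_same(features, 7)               # the dropped pixel sees its neighbors' values
w1 = rng.normal(size=(4, 2))                      # illustrative random weights
w2 = rng.normal(size=(2, 4))
out = context_aggregation(features, w1, w2)
```

Because the pooling window is large, the context computed at the dropped pixel is determined by its valid neighbors, so a single missing return barely changes the gate applied around it.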

Additional modifications include focal loss, which addresses class imbalance by down-weighting easy, well-classified points so that training focuses on hard examples; this benefits under-represented categories such as pedestrians and cyclists. Batch normalization and a LiDAR mask channel that marks missing points also contribute to the improved segmentation accuracy.
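To make the effect of focal loss concrete, the sketch below compares it with plain cross-entropy on per-point class probabilities, using the standard form FL(p_t) = -(1 - p_t)^gamma * log(p_t). The value `gamma=2.0`, the function names, and the convention in `lidar_mask` that a missing point is stored as zero range are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def cross_entropy(probs, labels, eps=1e-12):
    """Per-point cross-entropy. probs: (N, C) class probabilities; labels: (N,) integer ids."""
    p_t = probs[np.arange(len(labels)), labels]   # probability assigned to the true class
    return -np.log(p_t + eps)

def focal_loss(probs, labels, gamma=2.0, eps=1e-12):
    """Per-point focal loss: cross-entropy scaled by (1 - p_t) ** gamma."""
    p_t = probs[np.arange(len(labels)), labels]
    return -((1.0 - p_t) ** gamma) * np.log(p_t + eps)

def lidar_mask(range_image):
    """Binary mask channel: 1 where a return exists, 0 where the point is missing.
    Assumes missing points are encoded as zero range."""
    return (range_image > 0).astype(np.float32)

# One easy point (true-class probability 0.95) and one hard point (0.20).
probs = np.array([[0.95, 0.05],
                  [0.80, 0.20]])
labels = np.array([0, 1])

ce = cross_entropy(probs, labels)
fl = focal_loss(probs, labels)
relative_weight = fl / ce                         # equals (1 - p_t) ** gamma per point

mask = lidar_mask(np.array([[0.0, 5.2],
                            [3.1, 0.0]]))
```

The easy point keeps only a small fraction of its cross-entropy loss while the hard point keeps most of it, which is how the loss shifts attention toward difficult examples. Setting `gamma=0.0` recovers plain cross-entropy.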

Unsupervised Domain Adaptation

A major focus of this work is the domain gap between synthetic and real-world data. The gap arises because synthetic datasets, such as those generated from GTA-V, lack realistic noise patterns and intensity signals. To close it, the authors propose an unsupervised domain adaptation pipeline with three components:

Learned Intensity Rendering: A neural network predicts intensity values for synthetic data, trained in a self-supervised manner on real-world datasets. This approach leverages a hybrid loss function to better capture the distribution of real-world intensity values.
Geodesic Correlation Alignment: During training, this method computes the geodesic distance between the output distributions of synthetic and real data, incorporating this distance into the loss function to align domain representations.
Progressive Domain Calibration (PDC): This post-training step progressively recalibrates the network, one layer at a time, using unlabeled real-world data. Renormalizing each layer's feature distribution in turn keeps distribution shift from accumulating across layers.
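The geodesic correlation alignment term can be sketched as a log-Euclidean distance between the covariance matrices of two batches of output features, which is one common way to measure geodesic distance between covariance matrices. The function names, the `eps` floor on eigenvalues, and the absence of a normalization constant are assumptions for illustration; the summary does not give the exact formula.

```python
import numpy as np

def spd_log(cov, eps=1e-6):
    """Matrix logarithm of a symmetric positive semi-definite matrix via eigendecomposition."""
    w, v = np.linalg.eigh(cov)
    w = np.clip(w, eps, None)                     # keep eigenvalues positive so the log is defined
    return (v * np.log(w)) @ v.T                  # V diag(log w) V^T

def geodesic_loss(feat_sim, feat_real, eps=1e-6):
    """Squared Frobenius norm of log(C_sim) - log(C_real).
    feat_*: (N, D) arrays, one row per sample and one column per feature."""
    cov_sim = np.cov(feat_sim, rowvar=False)
    cov_real = np.cov(feat_real, rowvar=False)
    diff = spd_log(cov_sim, eps) - spd_log(cov_real, eps)
    return float(np.sum(diff ** 2))

rng = np.random.default_rng(0)
sim = rng.normal(size=(512, 4))                                       # stand-in synthetic batch
real = rng.normal(size=(512, 4)) * np.array([1.0, 2.0, 0.5, 3.0])     # stand-in real batch

same = geodesic_loss(sim, sim)        # identical statistics give zero distance
shifted = geodesic_loss(sim, real)    # mismatched covariances give a positive distance
```

During training, a term of this kind would be added to the segmentation loss with some weight, so that minimizing the total loss also pulls the second-order statistics of the two domains together.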

When the model is trained on synthetic data with this pipeline, its test accuracy on real-world data nearly doubles, from 29.0% to 57.4%.
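The layer-by-layer recalibration in progressive domain calibration can be sketched on a toy fully connected network: each layer's output statistics are re-estimated on unlabeled real data, and the renormalized output is what feeds the next layer before that layer is calibrated. The toy network, the function names, and the use of per-feature mean and variance are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def progressive_calibration(weights, real_x, eps=1e-5):
    """Re-estimate per-layer (mean, var) on unlabeled real data, one layer at a time."""
    stats = []
    h = real_x
    for w in weights:
        z = h @ w                                 # this layer's raw output on real data
        mean, var = z.mean(axis=0), z.var(axis=0)
        stats.append((mean, var))
        # The calibrated (renormalized) output, not the raw one, feeds the next layer.
        h = np.maximum((z - mean) / np.sqrt(var + eps), 0.0)
    return stats

def normalized_outputs(weights, stats, x, eps=1e-5):
    """Run x through the calibrated network and return each layer's normalized output."""
    outputs = []
    h = x
    for w, (mean, var) in zip(weights, stats):
        z_norm = (h @ w - mean) / np.sqrt(var + eps)
        outputs.append(z_norm)
        h = np.maximum(z_norm, 0.0)               # ReLU
    return outputs

rng = np.random.default_rng(0)
weights = [rng.normal(size=(6, 8)), rng.normal(size=(8, 3))]    # toy pretrained weights
real_x = rng.normal(loc=5.0, scale=3.0, size=(1024, 6))         # unlabeled data with shifted statistics

stats = progressive_calibration(weights, real_x)
calibrated = normalized_outputs(weights, stats, real_x)
```

After calibration, every layer's output on the real data has approximately zero mean and unit variance per feature, so a shift introduced at one layer is corrected before it can propagate to the next.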

Implications and Future Directions

SqueezeSegV2 addresses robustness issues that matter for real-time perception in autonomous vehicles. Its domain adaptation results suggest substantial potential for leveraging synthetic data and reducing reliance on costly labeled real-world datasets. CAM and the training techniques introduced here could also inform similar improvements in other deep learning applications that face domain shift.

Future research could refine these adaptation strategies, for example through adversarial training or by dynamically adjusting model parameters to environmental conditions. Expanding the variety of simulated environments could also improve generalization across a broader range of real-world scenarios.
