JSNet: Joint Instance and Semantic Segmentation of 3D Point Clouds (1912.09654v1)

Published 20 Dec 2019 in cs.CV

Abstract: In this paper, we propose a novel joint instance and semantic segmentation approach, which is called JSNet, in order to address the instance and semantic segmentation of 3D point clouds simultaneously. Firstly, we build an effective backbone network to extract robust features from the raw point clouds. Secondly, to obtain more discriminative features, a point cloud feature fusion module is proposed to fuse the different layer features of the backbone network. Furthermore, a joint instance semantic segmentation module is developed to transform semantic features into instance embedding space, and then the transformed features are further fused with instance features to facilitate instance segmentation. Meanwhile, this module also aggregates instance features into semantic feature space to promote semantic segmentation. Finally, the instance predictions are generated by applying a simple mean-shift clustering on instance embeddings. As a result, we evaluate the proposed JSNet on a large-scale 3D indoor point cloud dataset S3DIS and a part dataset ShapeNet, and compare it with existing approaches. Experimental results demonstrate our approach outperforms the state-of-the-art method in 3D instance segmentation with a significant improvement in 3D semantic prediction and our method is also beneficial for part segmentation. The source code for this work is available at https://github.com/dlinzhao/JSNet.

Summary

The paper introduces JSNet, a network that performs instance and semantic segmentation of 3D point clouds jointly, so that each task benefits from the other's features.
It combines a backbone network with a feature fusion module; on Area 5 of S3DIS, it is reported to gain 4.1 mCov and 6.8 mPrec over ASIS.
Instance predictions come from mean-shift clustering of the learned instance embeddings, and the method is evaluated on the S3DIS and ShapeNet datasets.

Overview of JSNet: Joint Instance and Semantic Segmentation of 3D Point Clouds

The paper introduces JSNet, a method that addresses instance and semantic segmentation of 3D point clouds simultaneously. Authored by Lin Zhao and Wenbing Tao, it combines a feature-extraction backbone, a multi-layer feature fusion module, and a joint segmentation module in a single network. The design targets the main difficulties of 3D point cloud processing, namely large-scale noisy data and substantial computational demands.

Key Contributions

Backbone Network and Feature Fusion: The authors propose a backbone network that extracts robust features from raw 3D point clouds. It is complemented by a Point Cloud Feature Fusion (PCFF) module that fuses features from different layers of the backbone, yielding more discriminative features for both semantic and instance segmentation.
Joint Instance and Semantic Segmentation Module: The Joint Instance and Semantic Segmentation (JISS) module lets the two tasks reinforce each other. It transforms semantic features into the instance embedding space and fuses them with the instance features to help instance segmentation; in the other direction, it aggregates instance features into the semantic feature space to help semantic segmentation.
Mean-Shift Clustering for Instance Prediction: Instance predictions are generated by applying mean-shift clustering to the instance embeddings, so that points whose embeddings converge to the same mode are grouped into one instance.
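
As a rough illustration of the clustering step described above, the following sketch runs a flat-kernel mean shift over a handful of made-up 2D embeddings. It is not the authors' implementation: the function name `mean_shift`, the `bandwidth` value, the merge rule for nearby modes, and the toy data are all assumptions chosen for illustration, and real instance embeddings would have more dimensions and many more points.

```python
import math


def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_shift(points, bandwidth, max_iter=50, tol=1e-4):
    """Flat-kernel mean shift (illustrative, O(n^2) per iteration).

    Each point is moved repeatedly to the mean of all input points that
    lie within `bandwidth` of its current position. Points whose final
    positions (modes) end up closer than `bandwidth` share a label.
    Returns (labels, centers).
    """
    modes = []
    for p in points:
        mode = list(p)
        for _ in range(max_iter):
            neighbors = [q for q in points if _dist(q, mode) <= bandwidth]
            if not neighbors:  # defensive guard against rounding issues
                break
            new_mode = [sum(c) / len(neighbors) for c in zip(*neighbors)]
            shift = _dist(new_mode, mode)
            mode = new_mode
            if shift < tol:
                break
        modes.append(mode)

    # Merge modes that converged to (almost) the same location.
    centers, labels = [], []
    for mode in modes:
        for idx, center in enumerate(centers):
            if _dist(mode, center) < bandwidth:
                labels.append(idx)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return labels, centers


# Hypothetical per-point instance embeddings forming two well-separated groups.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels, centers = mean_shift(embeddings, bandwidth=1.0)
```

In this toy setting the first three points receive one label and the last three another. The bandwidth controls how close two embeddings must be to count as the same instance, so in practice it has to be matched to how tightly the network clusters its embeddings.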

Experimental Validation

The authors validate their model on two datasets: the Stanford Large-Scale 3D Indoor Spaces dataset (S3DIS) and ShapeNet. Performance is reported with several metrics, including mean coverage (mCov), mean precision (mPrec), mean recall (mRec), and mean IoU (mIoU).

S3DIS Dataset: JSNet improves instance segmentation metrics over existing methods such as ASIS and 3D-BoNet, with notable gains on Area 5, a commonly used and challenging test split of the dataset. Specifically, JSNet records an increase of 4.1 mCov and 6.8 mPrec compared to ASIS.
ShapeNet Dataset: For part segmentation on ShapeNet, JSNet is reported to improve on the baseline PointNet++ method, which indicates that the approach is also beneficial beyond indoor scenes.
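
To make the instance metrics mentioned above more concrete, here is a minimal sketch of coverage, precision, and recall for a single class. The summary does not define these metrics, so the definitions below are assumptions based on how they are commonly computed in this line of work: coverage averages, over ground-truth instances, the best IoU with any prediction, and precision and recall count matches above an IoU threshold of 0.5. The "m" variants would then average these values over classes. All function names and the toy data are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two sets of point indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def coverage(gt_instances, pred_instances):
    """Mean, over ground-truth instances, of the best IoU with any prediction."""
    if not gt_instances:
        return 0.0
    best = [max((iou(g, p) for p in pred_instances), default=0.0)
            for g in gt_instances]
    return sum(best) / len(gt_instances)


def precision_recall(gt_instances, pred_instances, thresh=0.5):
    """Precision and recall with matches defined as IoU > thresh (assumed 0.5)."""
    matched_pred = sum(
        1 for p in pred_instances
        if any(iou(p, g) > thresh for g in gt_instances))
    matched_gt = sum(
        1 for g in gt_instances
        if any(iou(g, p) > thresh for p in pred_instances))
    precision = matched_pred / len(pred_instances) if pred_instances else 0.0
    recall = matched_gt / len(gt_instances) if gt_instances else 0.0
    return precision, recall


# Toy example: each instance is the set of point indices it contains.
gt = [{0, 1, 2, 3}, {4, 5, 6, 7}]
pred = [{0, 1, 2}, {4, 5, 6, 7}, {8}]

cov = coverage(gt, pred)
prec, rec = precision_recall(gt, pred)
```

Here the first prediction overlaps its ground-truth instance with IoU 0.75 and the second matches exactly, so coverage is 0.875; the spurious third prediction lowers precision to 2/3 while recall stays at 1.0.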

Implications and Future Work

The work is relevant to real-world tasks such as autonomous navigation and robotic perception, where accurate 3D environmental mapping is crucial. Practically, handling semantic and instance segmentation within a single framework like JSNet avoids running two separate networks and lets the two tasks share features.

Theoretically, this work contributes to the ongoing research into neural network architectures that effectively utilize hierarchical and multi-scale features for complex tasks. Future research could explore the incorporation of additional spatial and geometric features into JSNet, or investigate further optimization of the joint segmentation module to reduce computational overhead while maintaining high precision.

In conclusion, JSNet advances 3D point cloud segmentation by coupling instance and semantic segmentation in one network, and the reported experiments show it outperforming prior state-of-the-art approaches on several metrics. It provides a basis for future 3D segmentation methods, particularly those that must operate efficiently in complex, real-world environments.

GitHub

GitHub - dlinzhao/JSNet: JSNet: Joint Instance and Semantic Segmentation of 3D Point Clouds, AAAI2020 (102 stars)