Model2Scene: Learning 3D Scene Representation via Contrastive Language-CAD Models Pre-training (2309.16956v1)
Abstract: Current successful methods for 3D scene perception rely on large-scale annotated point clouds, which are tedious and expensive to acquire. In this paper, we propose Model2Scene, a novel paradigm that learns annotation-free 3D scene representations from Computer-Aided Design (CAD) models and language. The main challenges are the domain gaps between CAD models and real-scene objects, namely the model-to-scene gap (from a single model to a full scene) and the synthetic-to-real gap (from a synthetic model to a real object). To handle these challenges, Model2Scene first simulates a crowded scene by mixing data-augmented CAD models. Next, we propose a novel feature regularization operation, termed Deep Convex-hull Regularization (DCR), which projects point features into a unified convex-hull space, reducing the domain gap. Finally, we impose a contrastive loss between language embeddings and the point features of the CAD models to pre-train the 3D network. Extensive experiments verify that the learned 3D scene representation benefits various downstream tasks, including label-free 3D salient object detection, label-efficient 3D scene perception, and zero-shot 3D semantic segmentation. Notably, Model2Scene achieves label-free 3D salient object detection with an average mAP of 46.08% and 55.49% on the ScanNet and S3DIS datasets, respectively. The code will be publicly available.
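A minimal sketch (not the authors' code) of the two ideas named in the abstract: a convex-hull-style regularization that re-expresses point features as convex combinations of a small set of learnable anchor features, and a contrastive loss that aligns point features of CAD models with frozen language embeddings of the class names. All identifiers here (`ConvexHullRegularizer`, `num_anchors`, the toy tensors) are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvexHullRegularizer(nn.Module):
    """Project features into the convex hull spanned by learnable anchors (DCR-like, assumed)."""

    def __init__(self, feat_dim: int, num_anchors: int = 64):
        super().__init__()
        self.anchors = nn.Parameter(torch.randn(num_anchors, feat_dim))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (N, D) point features from either the synthetic or the real domain.
        # Softmax weights are non-negative and sum to 1, so each output lies inside
        # the convex hull of the anchors, i.e. a shared feature space for both domains.
        weights = F.softmax(feats @ self.anchors.t(), dim=-1)  # (N, K)
        return weights @ self.anchors                           # (N, D)


def language_point_contrastive_loss(point_feats, labels, text_embeds, tau=0.07):
    """InfoNCE-style loss between point features and per-class text embeddings.

    point_feats: (N, D) regularized point features of CAD-model points.
    labels:      (N,)   class index of each point (known for free from the CAD models).
    text_embeds: (C, D) frozen language embeddings of the C class names.
    """
    point_feats = F.normalize(point_feats, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)
    logits = point_feats @ text_embeds.t() / tau               # (N, C)
    return F.cross_entropy(logits, labels)


if __name__ == "__main__":
    # Toy usage: random tensors stand in for a 3D point backbone and a text encoder.
    N, D, C = 1024, 96, 20
    feats = torch.randn(N, D)            # would come from a 3D point backbone
    labels = torch.randint(0, C, (N,))   # CAD-model class labels
    text = torch.randn(C, D)             # would come from a pre-trained text encoder
    dcr = ConvexHullRegularizer(D, num_anchors=64)
    loss = language_point_contrastive_loss(dcr(feats), labels, text)
    loss.backward()
    print(f"contrastive pre-training loss: {loss.item():.4f}")
```

In this reading, the pre-training signal is entirely free: class labels come from the CAD models themselves and the text embeddings are frozen, so only the 3D backbone and the anchor set are optimized.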
- Runnan Chen (32 papers)
- Xinge Zhu (62 papers)
- Nenglun Chen (17 papers)
- Dawei Wang (49 papers)
- Wei Li (1122 papers)
- Yuexin Ma (97 papers)
- Ruigang Yang (68 papers)
- Tongliang Liu (251 papers)
- Wenping Wang (184 papers)