UniM-OV3D: Uni-Modality Open-Vocabulary 3D Scene Understanding with Fine-Grained Feature Representation (2401.11395v3)

Published 21 Jan 2024 in cs.CV

Abstract: 3D open-vocabulary scene understanding aims to recognize arbitrary novel categories beyond the base label space. However, existing works not only fail to fully utilize all the available modal information in the 3D domain but also lack sufficient granularity in representing the features of each modality. In this paper, we propose a unified multimodal 3D open-vocabulary scene understanding network, namely UniM-OV3D, which aligns point clouds with image, language and depth. To better integrate global and local features of the point clouds, we design a hierarchical point cloud feature extraction module that learns comprehensive fine-grained feature representations. Further, to facilitate the learning of coarse-to-fine point-semantic representations from captions, we propose the utilization of hierarchical 3D caption pairs, capitalizing on geometric constraints across various viewpoints of 3D scenes. Extensive experimental results demonstrate the effectiveness and superiority of our method in open-vocabulary semantic and instance segmentation, which achieves state-of-the-art performance on both indoor and outdoor benchmarks such as ScanNet, ScanNet200, S3DIS and nuScenes. Code is available at https://github.com/hithqd/UniM-OV3D.

Authors (10)
  1. Qingdong He (23 papers)
  2. Jinlong Peng (34 papers)
  3. Zhengkai Jiang (42 papers)
  4. Kai Wu (134 papers)
  5. Xiaozhong Ji (16 papers)
  6. Jiangning Zhang (102 papers)
  7. Yabiao Wang (93 papers)
  8. Chengjie Wang (178 papers)
  9. Mingang Chen (5 papers)
  10. Yunsheng Wu (25 papers)
Citations (4)