
UniOcc: Unifying Vision-Centric 3D Occupancy Prediction with Geometric and Semantic Rendering (2306.09117v1)

Published 15 Jun 2023 in cs.CV and cs.AI

Abstract: In this technical report, we present our solution, named UniOcc, for the Vision-Centric 3D occupancy prediction track in the nuScenes Open Dataset Challenge at CVPR 2023. Existing methods for occupancy prediction primarily focus on optimizing projected features in 3D volume space using 3D occupancy labels. However, generating these labels is complex and expensive (it relies on 3D semantic annotations), and because the labels are limited by voxel resolution, they cannot provide fine-grained spatial semantics. To address this limitation, we propose a novel Unifying Occupancy (UniOcc) prediction method, which explicitly imposes spatial geometry constraints and adds fine-grained semantic supervision through volume ray rendering. Our method significantly enhances model performance and demonstrates promising potential for reducing human annotation costs. Given the laborious nature of annotating 3D occupancy, we further introduce a Depth-aware Teacher Student (DTS) framework to enhance prediction accuracy using unlabeled data. Our solution achieves 51.27% mIoU on the official leaderboard with a single model, placing 3rd in this challenge.
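
The abstract does not spell out how volume ray rendering turns per-voxel predictions into 2D supervision, so the following is only a minimal sketch of the standard NeRF-style compositing rule, assuming densities and semantic scores have already been sampled along a single camera ray. The function name `render_ray` and its arguments are hypothetical and are not taken from the report; the authors' actual formulation may differ.

```python
import math


def render_ray(densities, deltas, depths, semantics):
    """Composite samples along one ray with standard volume rendering weights.

    densities: non-negative density at each sample (assumed input)
    deltas:    distance covered by each sample along the ray
    depths:    distance of each sample from the camera
    semantics: per-sample class scores, one list per sample
    """
    transmittance = 1.0  # fraction of the ray not yet absorbed
    weights = []
    for sigma, delta in zip(densities, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this sample
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha

    # Rendered depth and semantics are weighted sums over the samples.
    depth = sum(w * t for w, t in zip(weights, depths))
    num_classes = len(semantics[0])
    rendered_semantics = [
        sum(w * s[c] for w, s in zip(weights, semantics))
        for c in range(num_classes)
    ]
    return weights, depth, rendered_semantics


# Example: two empty samples followed by a dense surface of class 1.
weights, depth, sem = render_ray(
    densities=[0.0, 0.0, 50.0, 50.0],
    deltas=[1.0, 1.0, 1.0, 1.0],
    depths=[0.5, 1.5, 2.5, 3.5],
    semantics=[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]],
)
```

In a setup like this, the rendered depth and semantics could be compared against 2D depth and segmentation labels, which is one way rendering can supply supervision finer than the voxel grid.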

Authors (9)
  1. Mingjie Pan (8 papers)
  2. Li Liu (311 papers)
  3. Jiaming Liu (156 papers)
  4. Peixiang Huang (11 papers)
  5. Longlong Wang (5 papers)
  6. Shanghang Zhang (173 papers)
  7. Shaoqing Xu (11 papers)
  8. Zhiyi Lai (1 paper)
  9. Kuiyuan Yang (20 papers)
Citations (14)
