
GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation (2204.07733v2)

Published 16 Apr 2022 in cs.CV and cs.AI

Abstract: Birds-eye-view (BEV) semantic segmentation is critical for autonomous driving because of its powerful spatial representation ability. Estimating BEV semantic maps from monocular images is challenging due to the spatial gap, since the model is implicitly required to realize both the perspective-to-BEV transformation and the segmentation. We present a novel two-stage Geometry Prior-based Transformation framework named GitNet, consisting of (i) geometry-guided pre-alignment and (ii) a ray-based transformer. In the first stage, we decouple BEV segmentation into perspective image segmentation and geometric prior-based mapping, with explicit supervision obtained by projecting the BEV semantic labels onto the image plane to learn visibility-aware features and a learnable geometry to translate into BEV space. In the second stage, the pre-aligned coarse BEV features are further deformed by ray-based transformers to take visibility knowledge into account. GitNet achieves the leading performance on the challenging nuScenes and Argoverse datasets.
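The abstract mentions projecting BEV semantic labels onto the image plane as supervision. The paper's actual procedure is not given here, so the following is only a minimal sketch of that idea under loudly stated assumptions: a pinhole camera, a perfectly flat ground plane, a camera looking straight ahead, and camera axes with x to the right, y down, and z forward. All function names, parameters, and numeric values (`project_ground_point`, `project_bev_labels`, `cam_height`, `z_min`, the intrinsics) are hypothetical and chosen for illustration; they are not taken from GitNet.

```python
from typing import Dict, List, Optional, Tuple


def project_ground_point(x: float, z: float, fx: float, fy: float,
                         cx: float, cy: float,
                         cam_height: float) -> Optional[Tuple[float, float]]:
    """Project a flat-ground point to pixel coordinates (u, v).

    x is the lateral offset and z the forward distance, both in metres.
    Assumption: the ground lies cam_height metres below the camera, so the
    point has camera-frame coordinates (x, cam_height, z).
    """
    if z <= 0.0:
        return None  # behind the camera, so it cannot be visible
    u = fx * x / z + cx
    v = fy * cam_height / z + cy
    return (u, v)


def project_bev_labels(bev: List[List[int]], cell_size: float, z_min: float,
                       fx: float, fy: float, cx: float, cy: float,
                       cam_height: float, width: int,
                       height: int) -> Dict[Tuple[int, int], int]:
    """Map each labelled BEV cell centre to the image pixel it lands on.

    bev[row][col] holds a class id, with 0 meaning "no label". Row 0 is the
    row nearest the camera; columns are centred on the camera's optical axis.
    Cells that fall outside the image are dropped.
    """
    n_cols = len(bev[0])
    pixel_labels: Dict[Tuple[int, int], int] = {}
    for row, cells in enumerate(bev):
        z = z_min + (row + 0.5) * cell_size  # forward distance of cell centre
        for col, label in enumerate(cells):
            if label == 0:
                continue
            x = (col + 0.5 - n_cols / 2.0) * cell_size  # lateral offset
            uv = project_ground_point(x, z, fx, fy, cx, cy, cam_height)
            if uv is None:
                continue
            u, v = int(round(uv[0])), int(round(uv[1]))
            if 0 <= u < width and 0 <= v < height:
                pixel_labels[(u, v)] = label
    return pixel_labels


if __name__ == "__main__":
    # A 2x2 BEV grid with one labelled cell, 1 m cells, starting 9.5 m ahead.
    grid = [[0, 1],
            [0, 0]]
    print(project_bev_labels(grid, 1.0, 9.5, 1000.0, 1000.0, 800.0, 450.0,
                             1.5, 1600, 900))
```

A real pipeline would use the dataset's calibrated intrinsics and extrinsics and would rasterize whole cells rather than single cell centres; the flat-ground shortcut is used here only to keep the geometry visible in a few lines.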

Authors (7)
  1. Shi Gong (6 papers)
  2. Xiaoqing Ye (42 papers)
  3. Xiao Tan (75 papers)
  4. Jingdong Wang (236 papers)
  5. Errui Ding (156 papers)
  6. Yu Zhou (335 papers)
  7. Xiang Bai (222 papers)
Citations (28)
