Part-level Car Parsing and Reconstruction from Single Street View (1811.10837v2)

Published 27 Nov 2018 in cs.CV

Abstract: Part information has been shown to be robust to occlusions and viewpoint changes, which benefits various vision tasks. However, we found very little work on car pose estimation and reconstruction from street views that leverages part information. This paper makes two major contributions. First, we make the first attempt to build a framework that simultaneously estimates the shape, translation, orientation, and semantic parts of cars in 3D space from a single street view. Because annotating semantic parts on real street views is labor-intensive, we propose an approach that implicitly transfers part features from synthesized images to real street views. For pose and shape estimation, we propose a novel network structure that uses both part features and 3D losses. Second, we are the first to construct a high-quality dataset that contains 348 different car models with physical dimensions and part-level annotations based on global and local deformations. Given these models, we further generate 60K synthesized images with randomized orientation, illumination, occlusion, and texture. Our results demonstrate that part segmentation performance improves significantly after applying our implicit transfer approach. Our network for pose and shape estimation achieves state-of-the-art performance on the ApolloCar3D dataset and outperforms 3D-RCNN and DeepMANTA by 12.57 and 8.91 percentage points, respectively, in terms of mean A3DP-Abs.
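
The randomized image synthesis mentioned in the abstract can be pictured as drawing one set of rendering parameters per image. The sketch below is only an illustration of that idea, not the authors' pipeline: the abstract names the four randomized factors and the 348 car models, but the value ranges, the texture bank size, and every name here (`RenderParams`, `sample_render_params`, `sample_dataset`) are assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class RenderParams:
    """Hypothetical per-image rendering parameters."""
    model_id: int            # index into the 348 car models
    yaw_deg: float           # orientation about the vertical axis
    light_intensity: float   # relative illumination strength
    occlusion_ratio: float   # fraction of the car hidden by an occluder
    texture_id: int          # index into an assumed texture bank


def sample_render_params(rng, num_models=348, num_textures=100):
    """Draw one random parameter set. All ranges are assumed, not from the paper."""
    return RenderParams(
        model_id=rng.randrange(num_models),
        yaw_deg=rng.uniform(0.0, 360.0),
        light_intensity=rng.uniform(0.3, 1.5),
        occlusion_ratio=rng.uniform(0.0, 0.5),
        texture_id=rng.randrange(num_textures),
    )


def sample_dataset(num_images, seed=0):
    """Draw parameter sets for a whole synthetic dataset, reproducibly."""
    rng = random.Random(seed)
    return [sample_render_params(rng) for _ in range(num_images)]
```

Calling `sample_dataset(60_000)` would yield one parameter set per synthesized image; a renderer, which is outside the scope of this sketch, would then turn each set into an image with part labels taken from the annotated 3D model.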

Citations (5)
