P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching (2103.01055v2)
Abstract: Accurately describing and detecting 2D and 3D keypoints is crucial to establishing correspondences across images and point clouds. Although a plethora of learning-based 2D and 3D local feature descriptors and detectors have been proposed, the derivation of a shared descriptor and a joint keypoint detector that directly match pixels and points remains under-explored by the community. This work takes the initiative to establish fine-grained correspondences between 2D images and 3D point clouds. To directly match pixels and points, we present a dual fully convolutional framework that maps 2D and 3D inputs into a shared latent representation space, in which keypoints are simultaneously described and detected. Furthermore, an ultra-wide reception mechanism and a novel loss function are designed to mitigate the intrinsic information variations between pixel and point local regions. Extensive experimental results demonstrate that our framework shows competitive performance in fine-grained matching between images and point clouds and achieves state-of-the-art results for the task of indoor visual localization. Our source code will be available at [no-name-for-blind-review].
- Bing Wang (246 papers)
- Changhao Chen (64 papers)
- Zhaopeng Cui (64 papers)
- Jie Qin (68 papers)
- Chris Xiaoxuan Lu (50 papers)
- Zhengdi Yu (8 papers)
- Peijun Zhao (12 papers)
- Zhen Dong (87 papers)
- Fan Zhu (44 papers)
- Niki Trigoni (86 papers)
- Andrew Markham (94 papers)
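To make the dual-branch idea from the abstract concrete, here is a minimal PyTorch sketch of a two-branch network that maps an image and a point cloud into one shared descriptor space and predicts per-pixel / per-point detection scores. All names (`ImageBranch`, `PointBranch`, `DESC_DIM`), the layer widths, and the PointNet-style 3D branch are illustrative assumptions, not P2-Net's actual architecture.

```python
# Hedged sketch: dual fully convolutional branches producing unit-length
# descriptors in a shared space plus keypoint scores. Assumed stand-in,
# not the paper's exact network.
import torch
import torch.nn as nn
import torch.nn.functional as F

DESC_DIM = 128  # assumed shared descriptor dimensionality

class ImageBranch(nn.Module):
    """Fully convolutional 2D branch: per-pixel descriptors + scores."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.desc_head = nn.Conv2d(128, DESC_DIM, 1)
        self.score_head = nn.Conv2d(128, 1, 1)

    def forward(self, img):                            # img: (B, 3, H, W)
        feat = self.backbone(img)
        desc = F.normalize(self.desc_head(feat), dim=1)  # unit-length descriptors
        score = torch.sigmoid(self.score_head(feat))     # keypoint-ness in [0, 1]
        return desc, score

class PointBranch(nn.Module):
    """Per-point 3D branch (PointNet-style shared MLP as a stand-in)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(inplace=True),
            nn.Conv1d(64, 128, 1), nn.ReLU(inplace=True),
        )
        self.desc_head = nn.Conv1d(128, DESC_DIM, 1)
        self.score_head = nn.Conv1d(128, 1, 1)

    def forward(self, pts):                            # pts: (B, 3, N)
        feat = self.backbone(pts)
        desc = F.normalize(self.desc_head(feat), dim=1)
        score = torch.sigmoid(self.score_head(feat))
        return desc, score

# Because both branches emit L2-normalized descriptors in the same space,
# pixel-to-point matching reduces to nearest neighbours under cosine similarity:
img_desc, _ = ImageBranch()(torch.randn(1, 3, 64, 64))  # (1, 128, 64, 64)
pt_desc, _ = PointBranch()(torch.randn(1, 3, 1024))     # (1, 128, 1024)
pixels = img_desc.flatten(2).squeeze(0).t()              # (4096, 128)
points = pt_desc.squeeze(0).t()                          # (1024, 128)
sim = pixels @ points.t()                                # (4096, 1024) cosine similarity
best_point_per_pixel = sim.argmax(dim=1)                 # candidate 2D-3D matches
```

In this sketch the detection scores would weight or filter candidate correspondences; the paper's actual loss and ultra-wide reception mechanism, which handle the information gap between pixel and point neighborhoods, are not reproduced here.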