Leveraging Single-View Images for Unsupervised 3D Point Cloud Completion (2212.00564v3)
Abstract: Point clouds captured by scanning devices are often incomplete due to occlusion. To overcome this limitation, point cloud completion methods have been developed to predict the complete shape of an object from its partial input. These methods can be broadly classified as supervised or unsupervised, yet both categories require a large number of complete 3D point clouds, which may be difficult to capture. In this paper, we propose Cross-PCC, an unsupervised point cloud completion method that does not require any complete 3D point clouds. Instead, we utilize only 2D images of the complete objects, which are easier to capture than complete and clean 3D point clouds. Specifically, to exploit the complementary information in 2D images, we extract 2D features from a single-view RGB image and design a fusion module that fuses them with the 3D features extracted from the partial point cloud. To guide the shape of the predicted point clouds, we project the predicted points onto the 2D image plane and use the foreground pixels of the corresponding silhouette maps to constrain the positions of the projected points. To reduce outliers in the predicted point clouds, we propose a view calibrator that uses the single-view silhouette image to move points projected onto the background back into the foreground. To the best of our knowledge, our approach is the first point cloud completion method that requires no 3D supervision. Experimental results show that our method outperforms state-of-the-art unsupervised methods by a large margin and even achieves performance comparable to some supervised methods. We will make the source code publicly available at https://github.com/ltwu6/cross-pcc.
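The 2D supervision described in the abstract can be pictured with a minimal, hypothetical PyTorch sketch: predicted 3D points are projected onto the image plane with an assumed pinhole camera, and points whose projections fall outside the silhouette's foreground are penalized by their distance to the nearest foreground pixel (the same intuition behind the silhouette constraint and the view calibrator). The camera parameters, mask, and function names below are illustrative assumptions, not the paper's released implementation.

```python
import torch

def project_points(points, K, R, t):
    """Project N x 3 world-space points to pixel coordinates with a pinhole camera."""
    cam = points @ R.T + t            # world -> camera coordinates
    uv = cam @ K.T                    # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]     # perspective divide -> N x 2 pixel coordinates

def silhouette_projection_loss(points, silhouette, K, R, t):
    """Penalize predicted points whose projections land outside the foreground
    of a binary H x W silhouette mask (camera parameters are assumed known)."""
    uv = project_points(points, K, R, t)                    # N x 2 projected pixels
    fg = torch.nonzero(silhouette > 0.5).flip(-1).float()   # M x 2 foreground (x, y) pixels
    dists = torch.cdist(uv, fg)                             # N x M pairwise distances
    # points projected inside the silhouette contribute (near-)zero loss; points on the
    # background are pulled toward their nearest foreground pixel when the loss is minimized
    return dists.min(dim=1).values.mean()

# toy usage with a hypothetical 64 x 64 silhouette and a simple camera
if __name__ == "__main__":
    pts = (torch.rand(1024, 3) * 2 - 1).requires_grad_()
    sil = torch.zeros(64, 64)
    sil[16:48, 16:48] = 1.0
    K = torch.tensor([[32.0, 0.0, 32.0], [0.0, 32.0, 32.0], [0.0, 0.0, 1.0]])
    R, t = torch.eye(3), torch.tensor([0.0, 0.0, 3.0])
    loss = silhouette_projection_loss(pts, sil, K, R, t)
    loss.backward()   # gradients flow back to the predicted point positions
    print(float(loss))
```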
Authors: Lintai Wu, Qijian Zhang, Junhui Hou, Yong Xu