
3D Keypoint Estimation Using Implicit Representation Learning (2306.11529v1)

Published 20 Jun 2023 in cs.CV

Abstract: In this paper, we tackle the challenging problem of 3D keypoint estimation of general objects using a novel implicit representation. Previous works have demonstrated promising results for keypoint prediction through direct coordinate regression or heatmap-based inference. However, these methods are commonly studied for specific subjects, such as human bodies and faces, which possess fixed keypoint structures. They also suffer in several practical scenarios where explicit or complete geometry is not given, including images and partial point clouds. Inspired by the recent success of advanced implicit representation in reconstruction tasks, we explore the idea of using an implicit field to represent keypoints. Specifically, our key idea is employing spheres to represent 3D keypoints, thereby enabling the learnability of the corresponding signed distance field. Explicit keypoints can be extracted subsequently by our algorithm based on the Hough transform. Quantitative and qualitative evaluations also show the superiority of our representation in terms of prediction accuracy.
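The sphere-based signed distance field and the voting-based extraction mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the analytic `sphere_sdf` stands in for a learned network, and the function names, the radius, the bin size, and the gradient-based voting rule are all hypothetical choices that only mimic the spirit of a Hough-style accumulator.

```python
import numpy as np

def sphere_sdf(points, centers, radius):
    """Signed distance to a union of equal-radius spheres (assumed setup).

    points: (N, 3) query locations, centers: (K, 3) keypoints.
    Negative inside a sphere, positive outside. For overlapping spheres
    the value inside the overlap is only a bound, not an exact distance.
    """
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
    return dists.min(axis=1) - radius

def vote_centers(points, sdf, radius, eps=1e-4):
    """Each query point casts one vote for a sphere center.

    For a single sphere, the SDF gradient points away from the center,
    so center = x - (sdf(x) + radius) * grad(x). The gradient is
    approximated with central differences.
    """
    grad = np.zeros_like(points)
    for axis in range(3):
        step = np.zeros(3)
        step[axis] = eps
        grad[:, axis] = (sdf(points + step) - sdf(points - step)) / (2 * eps)
    return points - (sdf(points) + radius)[:, None] * grad

def extract_keypoints(votes, num_keypoints, bin_size=0.05):
    """Accumulate votes in a voxel grid and return the most-voted bins."""
    keys = np.round(votes / bin_size).astype(int)
    bins, counts = np.unique(keys, axis=0, return_counts=True)
    top = np.argsort(counts)[-num_keypoints:]
    return bins[top] * bin_size

# Toy example: two keypoints, field sampled at random locations.
centers = np.array([[0.3, 0.2, -0.1], [-0.4, 0.1, 0.5]])
radius = 0.1
rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(2000, 3))

field = lambda p: sphere_sdf(p, centers, radius)
votes = vote_centers(samples, field, radius)
recovered = extract_keypoints(votes, num_keypoints=len(centers))
print(recovered)
```

In this toy setup the field is exact, so nearly every vote lands on its nearest center; with a learned, noisy field the votes would spread out, which is where an accumulator with a suitable bin size becomes useful.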
