
Joint Data and Feature Augmentation for Self-Supervised Representation Learning on Point Clouds (2211.01184v1)

Published 2 Nov 2022 in cs.CV and cs.GR

Abstract: To reduce the burden of exhaustive annotation, self-supervised representation learning from unlabeled point clouds has drawn much attention, particularly augmentation-based contrastive methods. However, a specific augmentation rarely yields representations that transfer well to high-level tasks on different datasets. Moreover, augmentations on point clouds may change the underlying semantics. To address these issues, we propose a simple but efficient augmentation fusion contrastive learning framework that combines data augmentations in Euclidean space with feature augmentations in feature space. In particular, we propose a data augmentation method based on sampling and graph generation. We also design a data augmentation network that establishes a correspondence between representations by maximizing consistency between augmented graph pairs. We further design a feature augmentation network that uses an encoder perturbation to encourage the model to learn representations invariant to such perturbations. We conduct extensive object classification and object part segmentation experiments to validate the transferability of the proposed framework. Experimental results demonstrate that the framework learns point cloud representations effectively in a self-supervised manner and yields state-of-the-art results. The source code is publicly available at: https://zhiyongsu.github.io/Project/AFSRL.html.
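
The abstract names three ingredients: data augmentation by sampling and graph generation, a consistency objective between augmented graph pairs, and feature augmentation through an encoder perturbation. The following is a minimal NumPy sketch of how such pieces could fit together. It is not the authors' implementation. Uniform random sampling, a k-nearest-neighbor graph, the toy one-layer encoder, Gaussian weight noise, and the cosine loss are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import numpy as np

def random_sample(points, m, rng):
    """Pick m points uniformly at random without replacement (assumed sampler)."""
    idx = rng.choice(points.shape[0], size=m, replace=False)
    return points[idx]

def knn_graph(points, k):
    """Return an (n, k) array of nearest-neighbor indices, excluding each point itself."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)               # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def two_views(points, m=256, k=8, seed=0):
    """Data augmentation in Euclidean space: two sampled subsets, each with a graph."""
    rng = np.random.default_rng(seed)
    views = []
    for _ in range(2):
        sub = random_sample(points, m, rng)
        views.append((sub, knn_graph(sub, k)))
    return views

def encode(points, nbrs, weights):
    """Toy encoder (assumption): neighbor averaging, one linear layer, max pooling."""
    feats = points + points[nbrs].mean(axis=1)   # (n, 3)
    hidden = np.tanh(feats @ weights)            # (n, d)
    return hidden.max(axis=0)                    # (d,) global descriptor

def perturb(weights, eta, rng):
    """Feature augmentation (assumption): Gaussian noise scaled by the weight std."""
    return weights + eta * weights.std() * rng.standard_normal(weights.shape)

def cosine_consistency_loss(z1, z2):
    """1 - cosine similarity; 0 when the two descriptors point the same way."""
    cos = np.dot(z1, z2) / (np.linalg.norm(z1) * np.linalg.norm(z2))
    return float(1.0 - cos)
```

A possible use: build two views with `two_views(cloud)`, encode both with the same `weights` to get the data augmentation loss, then encode one view with `weights` and with `perturb(weights, eta, rng)` to get the feature augmentation loss. A real training loop would minimize a combination of the two with a learnable encoder; the weighting between them is not stated in the abstract.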

Authors (4)
  1. Zhuheng Lu (3 papers)
  2. Yuewei Dai (3 papers)
  3. Weiqing Li (19 papers)
  4. Zhiyong Su (9 papers)
Citations (4)