
Self-supervised adversarial masking for 3D point cloud representation learning (2307.05325v1)

Published 11 Jul 2023 in cs.CV

Abstract: Self-supervised methods have proven effective for learning deep representations of 3D point cloud data. Although recent methods in this domain often rely on random masking of inputs, this strategy leaves room for improvement. We introduce PointCAM, a novel adversarial method for learning a masking function for point clouds. Our model utilizes a self-distillation framework with an online tokenizer for 3D point clouds. In contrast to previous techniques that optimize patch-level and object-level objectives with randomly chosen masks, we propose an auxiliary network that learns how to select the masks. Our results show that the learned masking function achieves state-of-the-art or competitive performance on various downstream tasks. The source code is available at https://github.com/szacho/pointcam.
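The core idea in the abstract, replacing random masking with a mask chosen by a learned auxiliary network, can be sketched roughly as follows. This is an illustrative reconstruction, not PointCAM's actual architecture: the patch grouping, the tiny scoring MLP, and all shapes and names here are hypothetical stand-ins (the paper's masking network and tokenizer are richer, and the scorer would be trained adversarially against the self-distillation objective rather than left at random initialization).

```python
import numpy as np

rng = np.random.default_rng(0)

def group_patches(points, num_patches=8, patch_size=16):
    """Naively group a point cloud into local patches.

    Random centers + nearest neighbors stand in for the usual
    farthest-point-sampling + kNN grouping.
    """
    centers = points[rng.choice(len(points), num_patches, replace=False)]
    dists = np.linalg.norm(points[None, :, :] - centers[:, None, :], axis=-1)
    idx = np.argsort(dists, axis=1)[:, :patch_size]
    return points[idx], centers            # (num_patches, patch_size, 3), (num_patches, 3)

def mask_scores(centers, W1, b1, W2, b2):
    """Hypothetical masking network: a tiny MLP scoring each patch center."""
    h = np.maximum(centers @ W1 + b1, 0.0)  # ReLU hidden layer
    return (h @ W2 + b2).squeeze(-1)        # one masking logit per patch

# Toy point cloud and a randomly initialized scorer.
points = rng.standard_normal((256, 3))
patches, centers = group_patches(points)
W1, b1 = rng.standard_normal((3, 16)), np.zeros(16)
W2, b2 = rng.standard_normal((16, 1)), np.zeros(1)

scores = mask_scores(centers, W1, b1, W2, b2)
k = 4                                       # mask half of the 8 patches
masked = np.argsort(-scores)[:k]            # adversary masks the highest-scoring patches
visible = np.setdiff1d(np.arange(len(scores)), masked)
```

In the adversarial setup the scorer's parameters would be updated to *increase* the student's reconstruction/distillation loss on the masked patches, so the mask concentrates on regions that are hard to infer, instead of being sampled uniformly at random.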
