3DMambaComplete: Exploring Structured State Space Model for Point Cloud Completion

Published 10 Apr 2024 in cs.CV and cs.GR (arXiv:2404.07106v1)

Abstract: Point cloud completion aims to generate a complete, high-fidelity point cloud from an initially incomplete and low-quality input. A prevalent strategy leverages Transformer-based models to encode global features and facilitate reconstruction. However, the pooling operations used to obtain global feature representations often discard local details of the point cloud, and the attention mechanism inherent in Transformers introduces additional computational complexity, making long sequences hard to handle effectively. To address these issues, we propose 3DMambaComplete, a point cloud completion network built on the novel Mamba framework. It comprises three modules: the HyperPoint Generation module encodes point cloud features using Mamba's selection mechanism and predicts a set of HyperPoints by estimating an offset that turns the down-sampled points into HyperPoints; the HyperPoint Spread module disperses these HyperPoints across different spatial locations to avoid concentration; finally, a deformation method transforms the 2D mesh representation of HyperPoints into a fine-grained 3D structure for point cloud reconstruction. Extensive experiments on established benchmarks demonstrate that 3DMambaComplete surpasses state-of-the-art point cloud completion methods in both qualitative and quantitative analyses.
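The "selection mechanism" the abstract attributes to Mamba refers to a state-space recurrence whose parameters (step size, input and output projections) are themselves functions of the input, so the model can decide per token what to retain in its hidden state. A minimal, illustrative sketch of such a selective scan is below; this is not the paper's implementation, and all shapes, names, and the parameterization are assumptions made for exposition.

```python
import numpy as np

def selective_ssm_scan(x, A, B, C, delta):
    """Minimal selective state-space recurrence (Mamba-style sketch).

    x:     (T, D) input sequence of D-dimensional features
    A:     (D, N) state-transition parameters (kept non-positive for stability)
    B, C:  (T, N) input/output projections; input-dependent, hence "selective"
    delta: (T, D) per-step, per-channel step sizes; also input-dependent
    Returns ys: (T, D) output sequence.
    """
    T, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))          # hidden state, one N-dim state per channel
    ys = np.empty((T, D))
    for t in range(T):
        # Zero-order-hold discretization of the continuous-time SSM.
        dA = np.exp(delta[t][:, None] * A)        # (D, N) discrete transition
        dB = delta[t][:, None] * B[t][None, :]    # (D, N) discrete input map
        h = dA * h + dB * x[t][:, None]           # selective state update
        ys[t] = h @ C[t]                          # read out via C_t
    return ys
```

Because `delta`, `B`, and `C` vary with `t` (in Mamba they are produced by linear projections of the input), the recurrence can amplify or suppress individual tokens, which is the property the encoder above relies on in place of attention. The loop runs in time linear in the sequence length `T`, in contrast to the quadratic cost of self-attention.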
