
GridFormer: Point-Grid Transformer for Surface Reconstruction (2401.02292v1)

Published 4 Jan 2024 in cs.CV

Abstract: Implicit neural networks have emerged as a crucial technology in 3D surface reconstruction. To reconstruct continuous surfaces from discrete point clouds, encoding the input points into regular grid features (plane or volume) has been commonly employed in existing approaches. However, these methods typically use the grid as an index for uniformly scattering point features. Compared with the irregular point features, the regular grid features may sacrifice some reconstruction details but improve efficiency. To take full advantage of these two types of features, we introduce a novel and high-efficiency attention mechanism between the grid and point features named Point-Grid Transformer (GridFormer). This mechanism treats the grid as a transfer point connecting the space and point cloud. Our method maximizes the spatial expressiveness of grid features and maintains computational efficiency. Furthermore, optimizing predictions over the entire space could potentially result in blurred boundaries. To address this issue, we further propose a boundary optimization strategy incorporating margin binary cross-entropy loss and boundary sampling. This approach enables us to achieve a more precise representation of the object structure. Our experiments validate that our method is effective and outperforms the state-of-the-art approaches under widely used benchmarks by producing more precise geometry reconstructions. The code is available at https://github.com/list17/GridFormer.
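
To make the baseline that the abstract contrasts against more concrete, the following is a minimal sketch of using a grid "as an index for uniformly scattering point features": each point is assigned to one cell and the features in a cell are averaged. It is an illustration only, not the GridFormer attention mechanism and not code from the authors' repository. The function name `scatter_mean_to_grid`, the `resolution` parameter, and the assumption that coordinates are normalized to [0, 1] are all hypothetical choices made for this example.

```python
from collections import defaultdict


def scatter_mean_to_grid(points, features, resolution):
    """Average per-point features into the cells of a regular volume grid.

    points: list of (x, y, z) coordinates, assumed normalized to [0, 1].
    features: list of equal-length feature vectors, one per point.
    resolution: number of cells per axis (hypothetical parameter name).
    Returns a dict mapping an (i, j, k) cell index to its mean feature vector.
    """
    sums = {}
    counts = defaultdict(int)
    for p, f in zip(points, features):
        # Clamp so a coordinate of exactly 1.0 still lands in the last cell.
        cell = tuple(min(int(c * resolution), resolution - 1) for c in p)
        if cell not in sums:
            sums[cell] = [0.0] * len(f)
        for d, value in enumerate(f):
            sums[cell][d] += value
        counts[cell] += 1
    # Every point in a cell contributes equally, regardless of where it sits
    # inside the cell; this is the uniform scattering the abstract refers to.
    return {cell: [s / counts[cell] for s in total] for cell, total in sums.items()}


if __name__ == "__main__":
    pts = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (0.9, 0.9, 0.9)]
    feats = [[1.0, 3.0], [3.0, 5.0], [10.0, 0.0]]
    grid = scatter_mean_to_grid(pts, feats, resolution=2)
    for cell in sorted(grid):
        print(cell, grid[cell])
```

Because each cell keeps only an average, points that share a cell become indistinguishable, which is one way to read the abstract's remark that regular grid features may sacrifice reconstruction detail in exchange for efficiency.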

