Normal Transformer: Extracting Surface Geometry from LiDAR Points Enhanced by Visual Semantics
Abstract: High-quality surface normals can improve geometry estimation in problems faced by autonomous vehicles, such as collision avoidance and occlusion inference. While a considerable body of literature focuses on densely scanned indoor scenarios, normal estimation in autonomous driving remains difficult because real-world LiDAR scans are sparse, non-uniform, and noisy. In this paper, we introduce a multi-modal technique that uses 3D point clouds from LiDAR and 2D colour images from cameras for surface normal estimation. We present the Hybrid Geometric Transformer (HGT), a novel transformer-based neural network architecture that fuses visual semantic and 3D geometric information. We also develop an effective learning strategy for the multi-modal data. Experimental results show that our information fusion approach is more effective than existing methods. We further verify that the proposed model can learn from a simulated 3D environment that mimics a traffic scene, and that the learned geometric knowledge transfers to real-world 3D scenes in the KITTI dataset. Downstream tasks built on the normal vectors estimated on KITTI show that the proposed estimator has an advantage over existing methods.