SiLVR: Scalable Lidar-Visual Reconstruction with Neural Radiance Fields for Robotic Inspection (2403.06877v1)
Abstract: We present a neural-field-based, large-scale reconstruction system that fuses lidar and vision data to generate high-quality reconstructions that are geometrically accurate and capture photo-realistic textures. The system adapts the state-of-the-art neural radiance field (NeRF) representation to also incorporate lidar data, which adds strong geometric constraints on depth and surface normals. We exploit the trajectory from a real-time lidar SLAM system to bootstrap a Structure-from-Motion (SfM) procedure, both to significantly reduce the computation time and to provide metric scale, which is crucial for the lidar depth loss. We use submapping to scale the system to large-scale environments captured over long trajectories. We demonstrate the reconstruction system with data from a multi-camera, lidar sensor suite onboard a legged robot, carried hand-held while scanning building scenes for 600 metres, and onboard an aerial robot surveying a multi-storey mock disaster site building. Website: https://ori-drs.github.io/projects/silvr/
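The abstract describes supervising a NeRF not only with the usual photometric loss but also with lidar-derived depth and surface-normal constraints, which is why metric-scale poses from SLAM/SfM matter. Below is a minimal sketch of what such a combined training objective could look like; the function name, tensor shapes, and loss weights are assumptions for illustration, not the paper's actual implementation.

```python
import torch

def combined_nerf_lidar_loss(rgb_pred, rgb_gt,
                             weights, t_vals,
                             lidar_depth, lidar_normal, normal_pred,
                             lambda_depth=0.1, lambda_normal=0.05):
    """Hypothetical combined loss: photometric + lidar depth + lidar normal.

    rgb_pred, rgb_gt : (N_rays, 3) rendered and ground-truth colours
    weights          : (N_rays, N_samples) volume-rendering weights per sample
    t_vals           : (N_rays, N_samples) sample distances along each ray
    lidar_depth      : (N_rays,) metric depth from the lidar map (needs metric-scale poses)
    lidar_normal     : (N_rays, 3) unit surface normals estimated from lidar
    normal_pred      : (N_rays, 3) unit normals rendered from the radiance field
    """
    # Standard NeRF photometric term.
    loss_rgb = torch.mean((rgb_pred - rgb_gt) ** 2)

    # Expected ray-termination depth from the rendering weights,
    # supervised by the lidar depth measurement.
    depth_pred = torch.sum(weights * t_vals, dim=-1)
    loss_depth = torch.mean((depth_pred - lidar_depth) ** 2)

    # Encourage rendered normals to align with lidar-derived normals
    # (1 - cosine similarity).
    cos_sim = torch.sum(normal_pred * lidar_normal, dim=-1)
    loss_normal = torch.mean(1.0 - cos_sim)

    return loss_rgb + lambda_depth * loss_depth + lambda_normal * loss_normal
```

The depth term only makes sense if the camera poses are in metres, which is why the SLAM trajectory (rather than an unscaled SfM solution) is used to fix the scale.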