Hi-Map: Hierarchical Factorized Radiance Field for High-Fidelity Monocular Dense Mapping (2401.03203v1)

Published 6 Jan 2024 in cs.CV

Abstract: In this paper, we introduce Hi-Map, a novel monocular dense mapping approach based on Neural Radiance Fields (NeRF). Hi-Map is exceptional in its capacity to achieve efficient and high-fidelity mapping using only posed RGB inputs, eliminating the need for external depth priors derived from, e.g., a depth estimation model. Our key idea is to represent the scene as a hierarchical feature grid that encodes the radiance and then factorizes it into feature planes and vectors. This makes the scene representation simpler and more generalizable, enabling fast and smooth convergence on new observations; it allows for efficient computation while alleviating noise patterns by reducing the complexity of the scene representation. Building on the hierarchical factorized representation, we leverage the Signed Distance Field (SDF) as a rendering proxy for inferring the volume density, achieving high mapping fidelity. Moreover, we introduce a dual-path encoding strategy that strengthens the photometric cues and further boosts the mapping quality, especially in distant and textureless regions. Extensive experiments demonstrate our method's superiority in geometric and textural accuracy over state-of-the-art NeRF-based monocular mapping methods.
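
To make the factorized representation concrete, below is a minimal PyTorch sketch of a plane/vector-factorized feature grid (in the TensoRF style the abstract alludes to) with an SDF head whose output is converted to volume density. Everything here is an illustrative assumption rather than Hi-Map's published implementation: the resolution, channel width, the linear decoder, and the Laplace-CDF density transform (borrowed from VolSDF) are all placeholders; the paper's exact rendering proxy and hierarchical multi-level structure may differ.

```python
import torch
import torch.nn.functional as F


class FactorizedSDFGrid(torch.nn.Module):
    """Sketch of a plane/vector-factorized feature grid with an SDF head.

    Illustrative only: hyperparameters, the linear decoder, and the
    Laplace-CDF density transform are assumptions, not Hi-Map's
    published implementation.
    """

    def __init__(self, res: int = 128, ch: int = 16):
        super().__init__()
        # One feature plane and one feature vector per axis split:
        # (XY, Z), (XZ, Y), (YZ, X).
        self.planes = torch.nn.ParameterList(
            [torch.nn.Parameter(0.1 * torch.randn(1, ch, res, res))
             for _ in range(3)])
        self.lines = torch.nn.ParameterList(
            [torch.nn.Parameter(0.1 * torch.randn(1, ch, res, 1))
             for _ in range(3)])
        self.splits = [((0, 1), 2), ((0, 2), 1), ((1, 2), 0)]
        self.sdf_head = torch.nn.Linear(ch, 1)  # hypothetical SDF decoder
        self.beta = torch.nn.Parameter(torch.tensor(0.1))  # density sharpness

    def query(self, pts: torch.Tensor) -> torch.Tensor:
        """Sample features for points pts (N, 3) with coordinates in [-1, 1]."""
        n = pts.shape[0]
        feat = pts.new_zeros(n, self.sdf_head.in_features)
        for plane, line, ((a, b), c) in zip(self.planes, self.lines,
                                            self.splits):
            uv = pts[:, [a, b]].view(1, n, 1, 2)             # plane coords
            w = torch.stack((torch.zeros_like(pts[:, c]), pts[:, c]),
                            dim=-1).view(1, n, 1, 2)         # vector coord
            f_p = F.grid_sample(plane, uv, align_corners=True)  # (1, ch, n, 1)
            f_l = F.grid_sample(line, w, align_corners=True)    # (1, ch, n, 1)
            # Factorized feature: plane feature times complementary
            # vector feature, summed over the three axis splits.
            feat = feat + (f_p * f_l)[0, :, :, 0].t()
        return feat

    def forward(self, pts: torch.Tensor):
        sdf = self.sdf_head(self.query(pts))                 # (N, 1)
        # SDF -> density via a Laplace CDF (the VolSDF transform);
        # the paper's actual SDF-as-rendering-proxy may differ.
        beta = self.beta.abs() + 1e-4
        density = (1.0 / beta) * torch.where(
            sdf <= 0,
            0.5 * torch.exp(sdf / beta),
            1.0 - 0.5 * torch.exp(-sdf / beta))
        return sdf, density


# Usage: query SDF and density for random points in the unit cube.
pts = torch.rand(1024, 3) * 2.0 - 1.0
sdf, density = FactorizedSDFGrid()(pts)
```

Because each 3D lookup reduces to three 2D plane samples and three 1D vector samples, the parameter count and per-query cost grow far more slowly with resolution than for a dense 3D grid, which is consistent with the abstract's claim of a simpler, faster-converging representation.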
