Thermal-NeRF: Neural Radiance Fields from an Infrared Camera (2403.10340v1)
Abstract: In recent years, Neural Radiance Fields (NeRFs) have demonstrated significant potential in encoding highly detailed 3D geometry and environmental appearance, positioning themselves as a promising alternative to traditional explicit representations for 3D scene reconstruction. However, the predominant reliance on RGB imaging presupposes ideal lighting conditions, a premise frequently unmet in robotic applications plagued by poor lighting or visual obstructions. This limitation overlooks infrared (IR) cameras, which excel in low-light detection and offer a robust alternative under such adverse conditions. To address this gap, we introduce Thermal-NeRF, the first method to estimate a volumetric scene representation in the form of a NeRF solely from IR imaging. By leveraging a thermal mapping and a structural thermal constraint derived from the thermal characteristics of IR imaging, our method recovers NeRFs in visually degraded scenes where RGB-based methods fall short. Extensive experiments demonstrate that Thermal-NeRF achieves superior reconstruction quality compared to existing methods. Furthermore, we contribute a dataset for IR-based NeRF applications, paving the way for future research in IR NeRF reconstruction.
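To make the core idea concrete, the sketch below shows single-channel volume rendering as a thermal NeRF would plausibly use it: the field is queried for a density and a scalar thermal radiance (rather than an RGB color), and per-ray intensities are alpha-composited in the standard NeRF fashion. This is a minimal illustration under our own assumptions; the `field` callable, its output convention, and all names here are hypothetical and do not reproduce the authors' implementation, thermal mapping, or structural thermal constraint.

```python
# Minimal sketch of single-channel (thermal) NeRF volume rendering.
# Hypothetical illustration: `field`, its (density, thermal) output
# convention, and all parameter names are assumptions, not the paper's code.
import torch

def render_thermal_ray(field, origins, dirs, near=0.1, far=6.0, n_samples=64):
    """Render one thermal intensity per ray by alpha compositing.

    field(x) -> (sigma, t): volume density and scalar thermal radiance
    origins, dirs: (R, 3) ray origins and unit directions
    returns: (R,) rendered thermal intensities
    """
    R = origins.shape[0]
    # Evenly spaced depths along each ray (stratified sampling omitted for brevity).
    z = torch.linspace(near, far, n_samples, device=origins.device).expand(R, n_samples)
    pts = origins[:, None, :] + dirs[:, None, :] * z[..., None]      # (R, S, 3)

    # Query the field for density and a single thermal channel per sample.
    sigma, thermal = field(pts.reshape(-1, 3))
    sigma = sigma.reshape(R, n_samples)
    thermal = thermal.reshape(R, n_samples)

    # Standard NeRF compositing: w_i = T_i * (1 - exp(-sigma_i * delta_i)).
    delta = torch.cat([z[:, 1:] - z[:, :-1],
                       torch.full((R, 1), 1e10, device=z.device)], dim=-1)
    alpha = 1.0 - torch.exp(-torch.relu(sigma) * delta)
    trans = torch.cumprod(torch.cat([torch.ones(R, 1, device=z.device),
                                     1.0 - alpha + 1e-10], dim=-1), dim=-1)[:, :-1]
    weights = alpha * trans
    return (weights * thermal).sum(dim=-1)                            # (R,)
```

A structural thermal constraint would then enter training as an additional loss on rendered patches, for example a structural-similarity term alongside the per-ray intensity loss; the exact formulation used by Thermal-NeRF is specific to the paper and is not reproduced here.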