
Leveraging Neural Radiance Fields for Uncertainty-Aware Visual Localization (2310.06984v1)

Published 10 Oct 2023 in cs.CV and cs.RO

Abstract: As a promising approach to visual localization, scene coordinate regression (SCR) has seen tremendous progress in the past decade. Most recent methods adopt neural networks to learn the mapping from image pixels to 3D scene coordinates, which requires a vast amount of annotated training data. We propose to leverage Neural Radiance Fields (NeRF) to generate training samples for SCR. Despite NeRF's efficiency in rendering, many of the rendered data are polluted by artifacts or contain only minimal information gain, which can hinder regression accuracy or add unnecessary computational cost through redundant data. This paper addresses these challenges in three ways: (1) a NeRF is designed to separately predict uncertainties for the rendered color and depth images, revealing data reliability at the pixel level; (2) SCR is formulated as deep evidential learning with epistemic uncertainty, which is used to evaluate information gain and scene coordinate quality; (3) based on these three types of uncertainty, a novel view selection policy is formed that significantly improves data efficiency. Experiments on public datasets demonstrate that our method selects the samples that bring the most information gain and improves performance with the highest efficiency.
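To make points (2) and (3) of the abstract concrete, below is a minimal PyTorch sketch of a deep evidential regression head for scene coordinates and an uncertainty-based view scoring rule. It is illustrative only and not the authors' implementation: the head shape, the thresholds tau_rgb and tau_depth, and the view_score function are hypothetical stand-ins for the paper's actual selection policy; the closed-form epistemic uncertainty follows standard deep evidential regression (Amini et al., 2020).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class EvidentialSCRHead(nn.Module):
    """Per-pixel evidential regression head for scene coordinates (sketch).

    For each of the 3 coordinate axes it outputs the Normal-Inverse-Gamma
    parameters (gamma, nu, alpha, beta); the epistemic uncertainty of the
    predicted mean is then beta / (nu * (alpha - 1)) in closed form.
    """

    def __init__(self, feat_dim: int = 256, coord_dim: int = 3):
        super().__init__()
        self.coord_dim = coord_dim
        # 4 NIG parameters per coordinate axis, predicted densely per pixel
        self.proj = nn.Conv2d(feat_dim, 4 * coord_dim, kernel_size=1)

    def forward(self, feats: torch.Tensor):
        # feats: (B, feat_dim, H, W) -> each chunk: (B, coord_dim, H, W)
        gamma, log_nu, log_alpha, log_beta = self.proj(feats).chunk(4, dim=1)
        nu = F.softplus(log_nu)              # nu > 0
        alpha = F.softplus(log_alpha) + 1.0  # alpha > 1
        beta = F.softplus(log_beta)          # beta > 0
        epistemic = beta / (nu * (alpha - 1.0))  # Var[mu], per pixel and axis
        return gamma, epistemic


def view_score(rgb_unc: torch.Tensor, depth_unc: torch.Tensor,
               epistemic: torch.Tensor,
               tau_rgb: float = 0.5, tau_depth: float = 0.5) -> torch.Tensor:
    """Toy scoring rule for one rendered candidate view: mask out pixels the
    NeRF itself flags as unreliable (high color or depth uncertainty) and rank
    views by the mean epistemic uncertainty of the remaining pixels, used here
    as a rough proxy for information gain."""
    # rgb_unc, depth_unc: (H, W); epistemic: (coord_dim, H, W)
    reliable = (rgb_unc < tau_rgb) & (depth_unc < tau_depth)
    per_pixel = epistemic.mean(dim=0) * reliable  # average over xyz axes
    return per_pixel.sum() / reliable.sum().clamp(min=1)
```

In a selection loop of this kind, each candidate rendering would be scored with view_score and only the highest-scoring views added to the SCR training set, which is the rough shape of the data-efficiency argument in the abstract.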
