PlatoNeRF: 3D Reconstruction in Plato's Cave via Single-View Two-Bounce Lidar (2312.14239v2)

Published 21 Dec 2023 in cs.CV and eess.IV

Abstract: 3D reconstruction from a single-view is challenging because of the ambiguity from monocular cues and lack of information about occluded regions. Neural radiance fields (NeRF), while popular for view synthesis and 3D reconstruction, are typically reliant on multi-view images. Existing methods for single-view 3D reconstruction with NeRF rely on either data priors to hallucinate views of occluded regions, which may not be physically accurate, or shadows observed by RGB cameras, which are difficult to detect in ambient light and low albedo backgrounds. We propose using time-of-flight data captured by a single-photon avalanche diode to overcome these limitations. Our method models two-bounce optical paths with NeRF, using lidar transient data for supervision. By leveraging the advantages of both NeRF and two-bounce light measured by lidar, we demonstrate that we can reconstruct visible and occluded geometry without data priors or reliance on controlled ambient lighting or scene albedo. In addition, we demonstrate improved generalization under practical constraints on sensor spatial- and temporal-resolution. We believe our method is a promising direction as single-photon lidars become ubiquitous on consumer devices, such as phones, tablets, and headsets.


Summary

  • The paper proposes a novel single-view 3D reconstruction technique that integrates two-bounce lidar data with NeRF to capture occluded areas.
  • It employs single-photon lidar transient signals to supervise neural radiance fields, enhancing reconstruction accuracy under varied lighting and sensor constraints.
  • Experimental results demonstrate improved performance over existing methods, highlighting its potential for integration in consumer lidar-equipped devices.

Background

3D reconstruction, crucial for fields such as autonomous driving and virtual reality, traditionally depends on multi-view imagery. Capturing multiple views, however, can be tedious and impractical in dynamic environments. Existing single-view methods either rely on data priors to hallucinate occluded regions, which may not be physically accurate, or on shadows observed by RGB cameras, which are difficult to detect under ambient light and on low-albedo backgrounds.

Approach and Methodology

This paper introduces PlatoNeRF, a technique for reconstructing 3D scene geometry from a single view by combining neural radiance fields (NeRF) with two-bounce lidar signals. The method supervises NeRF with single-photon lidar transient data, sidestepping the limitations of existing single-view approaches. Its key ingredient is time-of-flight measurements of light that has bounced twice within the scene: these two-bounce transients carry information about both the visible geometry and occluded regions, since hidden geometry casts shadows in the second bounce.
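To make the two-bounce geometry concrete, below is a minimal sketch of the path-length arithmetic involved. It is illustrative only, not the paper's implementation; all names are hypothetical.

```python
import numpy as np

C = 3e8  # speed of light (m/s)

def two_bounce_tof(laser_pos, spot, surface_pt, sensor_pos):
    """Total time of flight along a two-bounce path:
    laser -> illuminated spot -> second surface point -> sensor."""
    d1 = np.linalg.norm(spot - laser_pos)         # laser to illuminated spot
    d2 = np.linalg.norm(surface_pt - spot)        # second bounce across the scene
    d3 = np.linalg.norm(sensor_pos - surface_pt)  # return to the sensor pixel
    return (d1 + d2 + d3) / C
```

Given a measured arrival time and calibrated laser and sensor positions, each pixel's transient constrains the unknown surface points along this path, and this is the signal the NeRF is trained to explain.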

The process involves illuminating specific points in the scene with a laser, capturing how light reflects within the scene, and measuring both the direct and indirect light paths with the single-photon sensor. A neural network is then trained to represent the scene's geometry from these measurements.
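As a hedged sketch of what such supervision could look like in PyTorch: render an expected termination distance from the NeRF density along each ray (standard volume-rendering weights) and penalize disagreement with the distance implied by the lidar transient. The function names, the assumption that the first and last path segments are known, and the squared-error loss are all illustrative, not the paper's exact formulation.

```python
import torch

def expected_distance(density, ts):
    """Expected ray termination distance under standard NeRF
    volume rendering, from densities sampled at depths ts."""
    deltas = ts[..., 1:] - ts[..., :-1]
    alpha = 1.0 - torch.exp(-density[..., :-1] * deltas)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[..., :1]), 1.0 - alpha + 1e-10], dim=-1),
        dim=-1)[..., :-1]
    weights = alpha * trans  # probability the ray terminates in each interval
    return (weights * ts[..., :-1]).sum(dim=-1)

def two_bounce_loss(density, ts, tof_peak, d1, d3, c=3e8):
    """Illustrative loss: the rendered second-bounce segment should match
    the segment implied by the measured time of flight, assuming the first
    and last path segments (d1, d3) are known or estimated separately."""
    d2_pred = expected_distance(density, ts)
    d2_meas = c * tof_peak - d1 - d3  # residual path length from the transient peak
    return torch.mean((d2_pred - d2_meas) ** 2)
```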

Results and Analysis

The developed method outperforms existing techniques based on single-view RGB images or prior two-bounce lidar approaches, producing more accurate reconstructions of both visible and hidden geometry. It also generalizes better under practical constraints such as reduced sensor spatial and temporal resolution, and it does not rely on controlled ambient lighting or scene albedo. These results demonstrate the method's robustness and its potential for consumer devices with built-in lidar.
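As a purely illustrative sketch of one such constraint (not the paper's code), coarser SPAD timing can be simulated by rebinning a measured transient histogram into wider time bins:

```python
import numpy as np

def coarsen_transient(hist, factor):
    """Simulate a lower temporal-resolution SPAD by summing photon
    counts over groups of `factor` adjacent time bins."""
    n = (len(hist) // factor) * factor  # drop any ragged tail
    return hist[:n].reshape(-1, factor).sum(axis=1)

# Example: a 4096-bin transient reduced to 512 coarser bins.
fine = np.random.poisson(0.1, size=4096)
coarse = coarsen_transient(fine, 8)
assert coarse.shape == (512,)
```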

Contributions and Implications

The main contributions of this work include a model that unifies two-bounce lidar with NeRF, single-view reconstruction of both visible and occluded geometry without data priors, and an analysis of the method's resilience to sensor and lighting constraints. The authors also generated a simulated dataset to facilitate further research and will release it publicly. This is a promising step forward, particularly as single-photon lidar sensors become common in everyday devices. Future work may include better handling of non-Lambertian surfaces and removal of occasional reconstruction artifacts.
