
Seeing the World through Your Eyes (2306.09348v2)

Published 15 Jun 2023 in cs.CV

Abstract: The reflective nature of the human eye is an underappreciated source of information about what the world around us looks like. By imaging the eyes of a moving person, we can collect multiple views of a scene outside the camera's direct line of sight through the reflections in the eyes. In this paper, we reconstruct a 3D scene beyond the camera's line of sight using portrait images containing eye reflections. This task is challenging due to 1) the difficulty of accurately estimating eye poses and 2) the entangled appearance of the eye iris and the scene reflections. Our method jointly refines the cornea poses, the radiance field depicting the scene, and the observer's eye iris texture. We further propose a simple regularization prior on the iris texture pattern to improve reconstruction quality. Through various experiments on synthetic and real-world captures featuring people with varied eye colors, we demonstrate the feasibility of our approach to recover 3D scenes using eye reflections.


Summary

  • The paper presents a novel approach that harnesses human eye reflections with neural radiance fields to reconstruct 3D scenes.
  • It refines cornea poses while employing a radial prior-based texture decomposition to disentangle iris patterns from environmental reflections.
  • Experimental validation on synthetic and real-world data shows improved image similarity and perceptual quality, confirmed by SSIM and LPIPS metrics.

Overview of "Seeing the World through Your Eyes"

The paper "Seeing the World through Your Eyes" by Hadi Alzayer, Kevin Zhang, Brandon Feng, Christopher Metzler, and Jia-Bin Huang presents a novel approach to reconstructing 3D scenes from reflections captured in human eyes. Leveraging the natural reflective properties of the human cornea, this work introduces a method to recover a 3D approximation of the observer's surroundings without requiring the camera to be oriented toward the scene itself. The approach treats the cornea as part of a catadioptric imaging system, effectively using it as a curved mirror that captures reflections of the surrounding environment.

Methodology

The authors use a stationary camera to capture reflections from a person's eyes as the person naturally moves their head. The primary challenges are estimating accurate cornea poses and disentangling the iris texture from the scene reflections superimposed on it. The proposed method jointly refines the cornea poses, reconstructs the 3D scene via neural radiance fields (NeRF), and models the observer's iris texture.

  1. Radiance Field Reconstruction: The method modifies standard NeRFs to use reflections from the cornea, treating eye reflections as indirect views of the scene. It calculates rays that reflect off the cornea surface, thereby allowing the NeRF to synthesize views and reconstruct the 3D scene from these non-traditional angles.
  2. Texture Decomposition: Recognizing the challenging entanglement of detailed iris textures and scene reflections, the approach introduces a 2D texture decomposition field. This field capitalizes on a radial prior to help isolate and mitigate the confounding influence of the iris pattern on the scene rendering.
  3. Cornea Pose Refinement: Accurate pose estimation of the cornea is crucial for viable multi-view reconstruction. An optimization strategy refines the initial pose guesses, adjusting for the intrinsic difficulties of eye localization and head movement variability.
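The core geometric step in item 1 above is tracing a camera ray to the cornea and redirecting it by mirror reflection before querying the radiance field. The sketch below is a minimal illustration of that idea, assuming the simplified spherical cornea model and standard ray-sphere intersection; it is not the authors' implementation, which also handles ellipse-based cornea localization and pose refinement.

```python
import numpy as np

def reflect_off_cornea(ray_o, ray_d, center, radius):
    """Intersect a camera ray with a spherical cornea model and return
    the reflected ray that would be cast into the NeRF.

    ray_o, ray_d : ray origin and unit direction, shape (3,)
    center, radius : cornea sphere parameters (hypothetical values)
    Returns (hit_point, reflected_dir), or None if the ray misses.
    """
    oc = ray_o - center
    b = np.dot(ray_d, oc)
    c = np.dot(oc, oc) - radius ** 2
    disc = b * b - c
    if disc < 0:
        return None                      # ray misses the cornea sphere
    t = -b - np.sqrt(disc)               # nearest intersection distance
    if t < 0:
        return None                      # cornea is behind the camera
    hit = ray_o + t * ray_d
    n = (hit - center) / radius          # outward unit surface normal
    reflected = ray_d - 2.0 * np.dot(ray_d, n) * n  # mirror reflection
    return hit, reflected

# Example: a head-on ray bounces straight back.
hit, refl = reflect_off_cornea(
    np.zeros(3), np.array([0.0, 0.0, 1.0]),
    center=np.array([0.0, 0.0, 5.0]), radius=1.0)
```

Rays that survive this test are then marched through the radiance field in the reflected direction, so the eye acts as an extra, curved "camera" per portrait image.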

Experimental Validation

The paper validates the proposed methodology with synthetic and real-world experiments. Synthetic tests in controlled environments demonstrate the robustness of the approach to pose estimation noise. In real-world setups, the method captures reflections with a static camera while the subject moves their head naturally, successfully reconstructing scenes under varied lighting and cornea visibility conditions.

Key numerical results underscore the effectiveness of texture decomposition and pose optimization in enhancing reconstruction fidelity. Quantitative metrics (SSIM and LPIPS) highlight improvements in image similarity and perceptual quality when employing these enhancements.
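For readers unfamiliar with the SSIM metric referenced above, the following is a simplified, global (single-window) SSIM computed over a whole image with numpy. Practical evaluations, including presumably this paper's, use a sliding Gaussian window (e.g. scikit-image's `structural_similarity`); this sketch only illustrates the formula.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Global SSIM between two images in [0, data_range].

    Uses one window covering the whole image; real SSIM averages
    the same statistic over local sliding windows.
    """
    c1 = (0.01 * data_range) ** 2   # stabilizer for the luminance term
    c2 = (0.03 * data_range) ** 2   # stabilizer for the contrast term
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

# Identical images score 1.0; degraded renderings score lower.
img = np.random.default_rng(0).random((64, 64))
noisy = np.clip(img + 0.1 * np.random.default_rng(1).standard_normal((64, 64)), 0, 1)
```

Higher SSIM indicates greater structural similarity to the ground-truth view, while LPIPS (a learned perceptual metric) is lower-is-better.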

Implications and Future Directions

This work has significant implications for extending non-line-of-sight imaging to biological reflectors. By reducing reliance on specialized equipment and exploiting a ubiquitous human behavior, simply looking at things, the approach has potential applications in surveillance, entertainment, and augmented reality. It also points toward integrating accidentally captured imagery into coherent physical reconstructions, offering new insights into passive imaging in dynamic environments.

Future research could address the limitations of controlled settings by exploring broader applications, such as dynamic movement scenarios and diverse environmental conditions. Enhancements in iris texture modeling and pose estimation could foster even more robust solutions for various practical settings. The integration of machine learning with implicit scene understanding represents a promising avenue for bridging gaps between vision technology and natural human behavior.
