SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild (2401.10171v2)

Published 18 Jan 2024 in cs.CV and cs.GR

Abstract: We present SHINOBI, an end-to-end framework for the reconstruction of shape, material, and illumination from object images captured with varying lighting, pose, and background. Inverse rendering of an object based on unconstrained image collections is a long-standing challenge in computer vision and graphics and requires a joint optimization over shape, radiance, and pose. We show that an implicit shape representation based on a multi-resolution hash encoding enables faster and robust shape reconstruction with joint camera alignment optimization that outperforms prior work. Further, to enable the editing of illumination and object reflectance (i.e. material) we jointly optimize BRDF and illumination together with the object's shape. Our method is class-agnostic and works on in-the-wild image collections of objects to produce relightable 3D assets for several use cases such as AR/VR, movies, games, etc. Project page: https://shinobi.aengelhardt.com Video: https://www.youtube.com/watch?v=iFENQ6AcYd8&feature=youtu.be

Summary

The paper introduces SHINOBI, a framework that uses a multi-resolution hash encoding and per-view importance weighting to robustly extract 3D shape, materials, and illumination from casually captured images.
It uses a modified camera parameterization and patch-based alignment losses to stabilize pose optimization and improve detail reconstruction.
Experimental results on the NAVI dataset show improved view synthesis and relighting, and significantly reduced run-time, compared to prior methods.

Introduction to SHINOBI

Inverse rendering of objects from images, which involves extracting 3D shapes, materials, and illumination information, presents significant challenges when working with unconstrained image collections. These images vary widely in lighting, pose and background, and are often captured with different devices. The ability to accurately reconstruct 3D assets from such images has wide applications in augmented/virtual reality (AR/VR), movies, and games.

Advancing Shape and Material Reconstruction

SHINOBI advances the reconstruction of 3D shape and material properties from in-the-wild images. Earlier methods struggle with the varying lighting, pose, and background found in casual image collections, and often produce poor shape reconstructions and camera registrations. SHINOBI addresses these limitations by using a multi-resolution hash encoding as the basis of its implicit shape representation. This makes reconstruction faster and, combined with joint camera alignment optimization, yields more robust camera poses than previous techniques.
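To make the idea of a multi-resolution hash encoding concrete, here is a minimal pure-Python sketch in the style of Instant-NGP. It is an illustration under stated assumptions, not SHINOBI's actual (hybrid) encoding: the function names, the base resolution, the growth factor, and the table layout are all hypothetical. Each level hashes the eight corners of the grid cell containing a 3D point into a small feature table and blends the looked-up features trilinearly; the per-level results are concatenated.

```python
import math
from itertools import product

# Hashing primes as used by Instant-NGP; Python ints do not wrap at 32 bits,
# so the hash values differ from a uint32 implementation but remain valid.
PRIMES = (1, 2654435761, 805459861)


def spatial_hash(ix, iy, iz, table_size):
    """Map an integer grid corner to a slot in a table of `table_size` entries."""
    return ((ix * PRIMES[0]) ^ (iy * PRIMES[1]) ^ (iz * PRIMES[2])) % table_size


def encode_point(p, tables, base_res=16, growth=2.0):
    """Encode p = (x, y, z) in [0, 1]^3 (hypothetical helper).

    `tables` holds one table per level; each table is a list of
    equal-length feature vectors (lists of floats).
    """
    features = []
    for level, table in enumerate(tables):
        res = int(base_res * growth ** level)      # grid resolution at this level
        scaled = [c * res for c in p]
        lo = [math.floor(c) for c in scaled]       # lower corner of the cell
        frac = [c - l for c, l in zip(scaled, lo)]  # position inside the cell
        level_feat = [0.0] * len(table[0])
        for offset in product((0, 1), repeat=3):   # the 8 cell corners
            weight = 1.0                           # trilinear weight of this corner
            for o, f in zip(offset, frac):
                weight *= f if o else 1.0 - f
            slot = spatial_hash(lo[0] + offset[0], lo[1] + offset[1],
                                lo[2] + offset[2], len(table))
            for k, value in enumerate(table[slot]):
                level_feat[k] += weight * value
        features.extend(level_feat)
    return features
```

In a real system the table entries would be learned parameters and the concatenated features would feed a small network that predicts the shape and material fields; here they are plain lists so the lookup and interpolation logic stays visible.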

The Core Innovations of SHINOBI

The SHINOBI framework differs from prior work in several key respects. A hybrid multi-resolution hash encoding stabilizes camera pose optimization and allows sharper features to be reconstructed. A modified camera parameterization, together with a camera multiplex constraint, enforces consistency across the camera proposals for each image. Per-view importance weighting makes the iterative optimization more reliable by focusing it on the most informative views. Finally, SHINOBI uses patch-based alignment losses to improve image-to-3D alignment.
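As a rough illustration of per-view importance weighting, the sketch below turns per-view losses into normalized weights and uses them to form a weighted objective. This is an assumption made for illustration only: the source does not specify how SHINOBI computes its weights, and the softmax-over-negative-loss rule, the `temperature` parameter, and the function names are all hypothetical. Under this rule, views that currently fit well contribute more to the objective than poorly aligned ones.

```python
import math


def view_weights(per_view_losses, temperature=1.0):
    """Softmax over negative losses (hypothetical weighting rule).

    Lower loss gives a higher weight; the weights sum to 1.
    """
    lowest = min(per_view_losses)  # subtract the minimum for numerical stability
    exps = [math.exp(-(loss - lowest) / temperature) for loss in per_view_losses]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_loss(per_view_losses, temperature=1.0):
    """Combine per-view losses using the weights from `view_weights`."""
    weights = view_weights(per_view_losses, temperature)
    return sum(w * loss for w, loss in zip(weights, per_view_losses))
```

A larger `temperature` flattens the weights toward a uniform average, while a smaller one concentrates the objective on the best-fitting views.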

Experimentation and Results

Our experiments on the NAVI dataset show that SHINOBI outperforms existing methods in view synthesis and relighting while significantly reducing the run-time needed to process a scene. The reconstructions are sharper and show more detail than those of prior methods. These results support SHINOBI's ability to generate relightable 3D assets from casually captured images and indicate its potential for use in graphics applications.

Conclusion and Future Work

SHINOBI marks a substantial step forward by robustly extracting 3D shapes, materials, and illumination from unposed image collections. While it generates high-quality 3D assets compatible with various downstream graphics applications, future improvements could further refine its ability to handle symmetrical objects, transparent materials, and complex lighting conditions. As the demand for realistic 3D models continues to grow, the development of frameworks like SHINOBI will become increasingly important for various industries.
