Materials for Masses: SVBRDF Acquisition with a Single Mobile Phone Image (1804.05790v1)

Published 16 Apr 2018 in cs.CV

Abstract: We propose a material acquisition approach to recover the spatially-varying BRDF and normal map of a near-planar surface from a single image captured by a handheld mobile phone camera. Our method images the surface under arbitrary environment lighting with the flash turned on, thereby avoiding shadows while simultaneously capturing high-frequency specular highlights. We train a CNN to regress an SVBRDF and surface normals from this image. Our network is trained using a large-scale SVBRDF dataset and designed to incorporate physical insights for material estimation, including an in-network rendering layer to model appearance and a material classifier to provide additional supervision during training. We refine the results from the network using a dense CRF module whose terms are designed specifically for our task. The framework is trained end-to-end and produces high quality results for a variety of materials. We provide extensive ablation studies to evaluate our network on both synthetic and real data, while demonstrating significant improvements in comparisons with prior works.

Summary

The paper proposes a novel CNN system with an in-network rendering layer and material classifier to estimate SVBRDF and normals from a single mobile phone image.
Experimental results show superior performance over previous methods and demonstrate high-quality SVBRDF and normal reconstructions from real-world mobile phone images.
This research makes high-quality material acquisition more accessible, with significant implications for AR, digital content creation, and robotics by using widely available consumer technology.

Overview of "Materials for Masses: SVBRDF Acquisition with a Single Mobile Phone Image"

The paper "Materials for Masses: SVBRDF Acquisition with a Single Mobile Phone Image" presents a method for recovering the spatially-varying bidirectional reflectance distribution function (SVBRDF) and surface normals of a near-planar surface from a single mobile phone image, captured with the flash on under arbitrary environment lighting. The authors propose a convolutional neural network (CNN) whose design incorporates physical insights about material appearance: an in-network rendering layer models the appearance of the predicted material, and a material classifier provides additional supervision during training.

Methodology

The authors introduce a lightweight CNN architecture for SVBRDF and normal map estimation. Its key features are an encoder-decoder structure that ties the predicted BRDF parameters to the material type, and a differentiable rendering layer that lets the network learn from rendered appearance under various lighting conditions. For robustness to unknown devices and capture settings, training relies on extensive data augmentation of a large synthetic dataset that offers high perceptual fidelity and covers a broader range of materials than typical datasets.
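To make the idea of a rendering layer concrete, the following is a minimal sketch of shading a planar sample from predicted diffuse color, normals, and roughness under a flash collocated with the camera. The summary does not specify the BRDF formulation, so the GGX distribution, the Smith-Schlick geometry term, the fixed Fresnel value `f0`, and the function name `render_collocated_flash` are all assumptions made for illustration. It is written in NumPy for readability; inside a network the same arithmetic would be expressed in an automatic-differentiation framework so gradients can flow through it.

```python
import numpy as np

def render_collocated_flash(diffuse, normal, roughness, points, light_pos,
                            intensity=1.0, f0=0.05):
    """Shade each pixel with a simplified microfacet BRDF (assumed form).

    The point light sits at the camera, so view direction == light direction.
    diffuse:   (H, W, 3) albedo
    normal:    (H, W, 3) unit surface normals
    roughness: (H, W, 1) values in (0, 1]
    points:    (H, W, 3) 3D position of each pixel on the surface
    light_pos: (3,) position of the flash/camera
    """
    to_light = light_pos - points
    dist2 = np.sum(to_light ** 2, axis=-1, keepdims=True)
    l = to_light / np.sqrt(dist2)

    # Collocated light and camera: v == l, so the half vector h == l as well.
    n_dot_l = np.clip(np.sum(normal * l, axis=-1, keepdims=True), 1e-4, 1.0)
    n_dot_h = n_dot_l

    alpha = roughness ** 2
    # GGX normal distribution term (assumption, not taken from the summary).
    d = alpha ** 2 / (np.pi * (n_dot_h ** 2 * (alpha ** 2 - 1.0) + 1.0) ** 2)
    # Smith-Schlick geometry term; identical for view and light here.
    k = alpha / 2.0
    g1 = n_dot_l / (n_dot_l * (1.0 - k) + k)
    g = g1 * g1
    # Schlick Fresnel reduces to f0 because v . h == 1 in this setup.
    specular = d * g * f0 / (4.0 * n_dot_l * n_dot_l)

    # Diffuse + specular, cosine foreshortening, inverse-square falloff.
    return (diffuse / np.pi + specular) * n_dot_l * intensity / dist2
```

Because the light falls off with squared distance and the cosine term shrinks away from the optical axis, a flat sample rendered this way is brightest directly under the flash, which matches the highlight pattern a flash photo of a near-planar surface would show.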

The network is trained with a composite loss that combines a standard L2 loss on each BRDF component with a reconstruction loss on the image rendered from the predicted parameters, so that errors in overall appearance are penalized alongside errors in individual parameters. In addition, the network's material classifier predicts the material type, which guides the latent representation towards more accurate parameter estimation.
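A loss of this shape can be sketched as a weighted sum of per-component L2 terms, an L2 term on the rendered image, and a cross-entropy term for the material classifier. The component names, the unit default weights, and the function `composite_loss` are hypothetical; the summary does not give the actual weighting or the exact form of each term.

```python
import numpy as np

def l2(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    shifted = logits - np.max(logits)
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return float(-log_probs[label])

def composite_loss(pred, target, rendered_pred, rendered_target,
                   class_logits, class_label, weights=None):
    """Weighted sum of parameter, rendering, and classification terms.

    pred / target: dicts with 'diffuse', 'normal', 'roughness' arrays.
    The default weights below are placeholders, not values from the paper.
    """
    w = {"diffuse": 1.0, "normal": 1.0, "roughness": 1.0,
         "render": 1.0, "class": 1.0}
    if weights:
        w.update(weights)

    # One L2 term per BRDF component.
    terms = {name: l2(pred[name], target[name])
             for name in ("diffuse", "normal", "roughness")}
    # Reconstruction term on the image produced by the rendering layer.
    terms["render"] = l2(rendered_pred, rendered_target)
    # Auxiliary supervision from the material classifier.
    terms["class"] = cross_entropy(class_logits, class_label)

    total = sum(w[name] * value for name, value in terms.items())
    return total, terms
```

Returning the individual terms alongside the total makes it easy to see which part of the objective dominates during training, which is useful when tuning the weights.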

The network's predictions are then refined by a dense conditional random field (DCRF) module whose terms are adapted to the microfacet BRDF model. This step improves both SVBRDF and normal accuracy by correcting artifacts in the network's predictions.
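The summary does not state the DCRF energy, so the following is only a generic sketch of continuous dense-CRF refinement, not the paper's formulation. It assumes a quadratic energy with a unary term that keeps each pixel close to the network prediction and a fully connected pairwise term that pulls together pixels that are close in position and similar in a guidance image. The function name `dense_crf_refine`, the kernel features, and all parameters are hypothetical. With a quadratic energy the minimizer is the solution of a linear system, which is solved directly here; that is only practical for small images.

```python
import numpy as np

def dense_crf_refine(pred, guide, unary_weight=1.0, pair_weight=1.0,
                     sigma_pos=2.0, sigma_guide=0.1):
    """Refine one predicted channel with a fully connected quadratic CRF.

    Minimizes  a * sum_i (x_i - pred_i)^2
             + (b / 2) * sum_{i,j} k_ij * (x_i - x_j)^2
    where k_ij is a Gaussian kernel over pixel position and guide value.
    pred, guide: (H, W) arrays.
    """
    h, w = pred.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    g = guide.ravel().astype(float)

    # Dense pairwise kernel between every pair of pixels (O(N^2) memory).
    pos_d2 = np.sum((pos[:, None, :] - pos[None, :, :]) ** 2, axis=-1)
    guide_d2 = (g[:, None] - g[None, :]) ** 2
    k = np.exp(-pos_d2 / (2.0 * sigma_pos ** 2)
               - guide_d2 / (2.0 * sigma_guide ** 2))
    np.fill_diagonal(k, 0.0)

    # Setting the gradient to zero gives (a I + b L) x = a * pred,
    # with L the graph Laplacian of the kernel.
    laplacian = np.diag(k.sum(axis=1)) - k
    system = unary_weight * np.eye(h * w) + pair_weight * laplacian
    x = np.linalg.solve(system, unary_weight * pred.ravel())
    return x.reshape(h, w)
```

Raising `pair_weight` relative to `unary_weight` smooths more aggressively, while a small `sigma_guide` keeps the smoothing from crossing edges in the guidance image.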

Experimental Results

The paper reports improvements over prior SVBRDF capture methods, among them approaches that need complex multi-image setups and approaches that struggle with non-stationary textures. Extensive ablation studies indicate that material classification improves both the accuracy of the predicted parameters and the accuracy of the rendered result. The gains are reported quantitatively, as lower error on the reconstructed parameters, and qualitatively, as convincing visual fidelity across diverse material samples.

Beyond synthetic datasets, the authors validate their method using real-world images captured with different mobile phones, underlining the framework's relevance for practical applications. Results indicate that the proposed approach consistently delivers high-quality SVBRDF and normal reconstructions, evidenced by realistic relighting and material editing capabilities.

Implications and Future Directions

This research moves material acquisition towards more accessible and practical solutions by sidestepping the traditional requirement for elaborate setups and professional equipment. Its implications extend to augmented reality (AR), digital content creation, and robotics, where realistic material rendering from minimal data is crucial. The method's adaptability to diverse devices and environments suggests that it generalizes well, which improves its prospects for widespread use.

Future work could extend the approach to more complex geometries and environments, possibly by integrating semantic information to improve material recognition and exploit contextual cues. Combined with large-scale mobile applications, the approach could democratize high-quality material digitization across many domains.

Overall, this paper contributes a significant step forward in practical materials capture using widely available consumer technology, paving the way for future enhancements in the synthesis and editing of photorealistic visuals in mobile computing contexts.