URHand: Universal Relightable Hands (2401.05334v1)

Published 10 Jan 2024 in cs.CV and cs.GR

Abstract: Existing photorealistic relightable hand models require extensive identity-specific observations in different views, poses, and illuminations, and face challenges in generalizing to natural illuminations and novel identities. To bridge this gap, we present URHand, the first universal relightable hand model that generalizes across viewpoints, poses, illuminations, and identities. Our model allows few-shot personalization using images captured with a mobile phone, and is ready to be photorealistically rendered under novel illuminations. To simplify the personalization process while retaining photorealism, we build a powerful universal relightable prior based on neural relighting from multi-view images of hands captured in a light stage with hundreds of identities. The key challenge is scaling the cross-identity training while maintaining personalized fidelity and sharp details without compromising generalization under natural illuminations. To this end, we propose a spatially varying linear lighting model as the neural renderer that takes physics-inspired shading as input feature. By removing non-linear activations and bias, our specifically designed lighting model explicitly keeps the linearity of light transport. This enables single-stage training from light-stage data while generalizing to real-time rendering under arbitrary continuous illuminations across diverse identities. In addition, we introduce the joint learning of a physically based model and our neural relighting model, which further improves fidelity and generalization. Extensive experiments show that our approach achieves superior performance over existing methods in terms of both quality and generalizability. We also demonstrate quick personalization of URHand from a short phone scan of an unseen identity.


Summary

  • The paper presents a model that creates personalized, photorealistic hand renders with real-time relighting using simple mobile phone captures.
  • It employs a spatially varying linear lighting model within a dual-branch (physically based and neural) framework to model light transport faithfully.
  • Quantitative experiments demonstrate that URHand outperforms existing methods in adaptability, efficiency, and rendering accuracy under diverse illuminations.

Overview of the URHand Model

The paper presents URHand, a model for photorealistic, relightable human hands that generalizes across identities, viewpoints, poses, and lighting conditions. In digital media such as video games and virtual environments, hands are omnipresent and central to user interaction, and relighting them in real time to match the surrounding illumination is crucial for immersive experiences. However, existing methods for building such realistic hand models tend to be resource-intensive, lack generalizability, and require extensive identity-specific data capture.

Key Innovations

URHand addresses these challenges by streamlining personalization: a short capture from a mobile phone suffices to create a personalized hand model that can be rendered photorealistically in real time under varied lighting. The team achieves this by building a robust universal relightable prior, trained on multi-view light-stage captures of hands spanning hundreds of identities. This prior preserves personalized detail and fidelity without sacrificing the model's ability to generalize to natural illuminations.

Technical Approach

To maintain a high level of detail while preserving the linearity of light transport (an essential property for physically plausible relighting), the model employs a spatially varying linear lighting model as its neural renderer. This renderer omits the non-linear activation functions and bias terms typically found in neural networks, so the predicted appearance remains linear in the incident illumination, as physics dictates. As a result, URHand can be trained in a single stage on light-stage data and then render hands in real time under arbitrary continuous illuminations.
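
To make the linearity constraint concrete, below is a minimal sketch (not the authors' code) of a lighting module that stays linear in the light-dependent inputs: the network's non-linearities only ever see light-independent features, while the physics-inspired shading features for each light are combined purely through weighted sums. All module, tensor, and shape names here are illustrative assumptions.

import torch
import torch.nn as nn

class SpatiallyVaryingLinearLighting(nn.Module):
    """Toy linear-in-light renderer; names and shapes are illustrative."""
    def __init__(self, feat_dim=64, shading_dim=8):
        super().__init__()
        # Non-linear layers are allowed here: they only see light-independent
        # identity/pose features, so linearity in the illumination is kept.
        self.weight_net = nn.Sequential(
            nn.Conv2d(feat_dim, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, shading_dim, 3, padding=1),
        )

    def forward(self, id_pose_feat, shading_per_light):
        # id_pose_feat:      (B, feat_dim, H, W), light-independent features
        # shading_per_light: (B, L, shading_dim, H, W), physics-inspired
        #                    shading terms computed per light source
        w = self.weight_net(id_pose_feat)  # (B, shading_dim, H, W)
        # Only weighted sums follow: no activation or bias touches the
        # light-dependent features, so the contributions of the L lights
        # superpose exactly.
        per_light = (w.unsqueeze(1) * shading_per_light).sum(dim=2)  # (B, L, H, W)
        return per_light.sum(dim=1)  # (B, H, W) relit appearance

Because the output is a plain sum over per-light contributions, any continuous illumination (for example an environment map discretized into many lights) can be handled by the same weights at render time, which is what allows training on discrete light-stage data to transfer to natural lighting.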

Moreover, the paper introduces a dual-branch approach that combines physically based geometry estimation with neural relighting. In this hybrid framework, the physical branch refines the hand geometry and produces physics-inspired shading features, while the neural branch models complex light-transport effects such as subsurface scattering. Both branches are optimized jointly with tailored loss functions to improve the fidelity and detail of the final relighting.
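
As a rough illustration of this joint optimization, the following sketch (with hypothetical module, tensor, and loss names, not the paper's actual interfaces) supervises both branches against the same captured ground truth in a single training step, so the physically based shading and the neural relighting improve together.

import torch
import torch.nn.functional as F

def joint_training_step(physical_branch, neural_branch, batch, optimizer):
    # Physical branch: refine hand geometry and produce physics-inspired
    # shading features plus a physically based render for supervision.
    geometry, shading_feat, pbr_render = physical_branch(
        batch["images"], batch["pose"], batch["lights"]
    )

    # Neural branch: predict the final appearance from the shading features,
    # covering effects the physical model misses (e.g. subsurface scattering).
    neural_render = neural_branch(shading_feat)

    # Losses on both outputs keep the two branches consistent; the
    # weighting used here is an arbitrary placeholder.
    loss = F.l1_loss(neural_render, batch["gt_image"]) \
        + 0.1 * F.l1_loss(pbr_render, batch["gt_image"])

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()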

Results and Potential

Quantitative experiments and extensive ablation studies validate the superiority of URHand over existing methods, both in quality and the ability to adapt to novel situations. Pivotal to this endeavor is their strategy of combining the strengths of physically based rendering with the versatility of data-driven neural approaches. This combination unlocks powerful realism and flexibility previously unattainable in real-time applications. Moreover, the paper demonstrates URHand's capability of quick personalization from a casual phone scan, making it a pioneer in easily accessible, realistic, and relightable hand modeling.